US 20040170284A1
(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2004/0170284 A1
Janse et al. (43) Pub. Date: Sep. 2, 2004
(54) SOUND REINFORCEMENT SYSTEM HAVING AN ECHO SUPPRESSOR AND LOUDSPEAKER BEAMFORMER
(76) Inventors: Cornelis Pieter Janse, Eindhoven (NL); Harm Jan Willem Belt, Leuven (BE)
Correspondence Address:
Philips Electronics North America Corporation
P.O. Box 3001
Briarcliff Manor, NY 10510 (US)
(21) Appl. No.: 10/483,854
(22) PCT Filed: Jun. 24, 2002
(86) PCT No.: PCT/IB02/02576
(30) Foreign Application Priority Data
Jul. 20, 2001 (EP) 01202791.8
Publication Classification
(51) Int. Cl.7: H04R 3/00; H04B 3/20
(52) U.S. Cl.: 381/66; 381/92
(57) ABSTRACT
A sound reinforcement system (1) comprises several microphones (2), a microphone beamformer (5) coupled to the microphones (2), adaptive echo compensation (EC) means (4) coupled to the microphone beamformer (5) for generating an echo compensated microphone signal, and several loudspeakers (3) coupled to the adaptive EC means (4). The sound reinforcement system (1) further comprises an adaptive loudspeaker beamformer (11) coupled between the adaptive EC means (4) and the loudspeakers (3) for shaping the directional pattern of the loudspeakers (3). Advantageously the adaptive loudspeaker beamformer creates a beam pattern which is capable of creating a "null" in the direction of speaker(s) such that howling is effectively prevented. The loudspeaker beamformer (11) may for example be a Weighted Sum Beamformer, a Delay and Sum Beamformer or a Filtered Sum Beamformer.
SOUND REINFORCEMENT SYSTEM HAVING AN ECHO SUPPRESSOR AND LOUDSPEAKER BEAMFORMER

[0001] The present invention relates to a sound reinforcement system comprising at least one microphone, adaptive echo compensation (EC) means coupled to the at least one microphone for generating an echo compensated microphone signal, and at least one loudspeaker coupled to the adaptive EC means.

[0002] Such a sound reinforcement system is known from applicant's U.S. Pat. No. 5,748,751. The known sound reinforcement system is provided with a microphone, adaptive echo compensation (hereafter indicated EC) means in the form of an adaptive echo canceller filter coupled to the microphone for generating an echo compensated microphone signal. The system further has a loudspeaker and an amplifier coupled to the adaptive EC means.

[0003] It is a disadvantage of the known sound reinforcement system that if two or more loudspeakers are connected to the sound reinforcement system the output sound quality leaves much to be desired, in particular in terms of sound direction, echo and/or reverberation.

[0004] Therefore it is an object of the present invention to provide an improved sound reinforcement system capable of effectively tailoring sound direction, echo and reverberation properties, while still canceling various types of echoes, in particular in cases wherein a plurality of loudspeakers is used.

[0005] Thereto the sound reinforcement system according to the invention is characterized in that the sound reinforcement system further comprises a microphone beamformer coupled to the adaptive EC means; and an adaptive loudspeaker beamformer coupled between the adaptive EC means and several of the loudspeakers for shaping the directional pattern of the loudspeakers.

[0006] It is an advantage of the sound reinforcement system according to the present invention that by shaping the directional pattern of the loudspeakers, possibly also for example in dependence on the echo and/or reverberation properties of a room or hall, the audibility of the system can be improved. Also the direction of the sound produced by the loudspeakers can be made dependent on the position or an area of expected movements of the speaker or speakers carrying the microphone or microphones respectively. Specifically the sound output can be made minimal at a respective speaker position. Advantageously the loudspeaker beamformer may create a beam pattern which is capable of creating a "null" in the direction of the speaker(s) such that howling is effectively prevented.

[0007] Several possible embodiments of the sound reinforcement system according to the invention are characterized in that the adaptive loudspeaker beamformer (11) is a Weighted Sum Beamformer, a Delay and Sum Beamformer or a Filtered Sum Beamformer.

[0008] Advantageously these embodiments link up closely with beamformer techniques already known per se.

[0009] A further embodiment of the sound reinforcement system according to the invention is characterized in that the adaptive loudspeaker beamformer is coupled to the microphone beamformer, while both beamformers have beamformer coefficients, such that the combined loudspeaker beam pattern and the combined microphone beam pattern are complementary.

[0010] It is an advantage of the sound reinforcement system according to the invention that such an embodiment reduces the unwanted coupling between the loudspeaker beam which is directed to the speaker and the microphone beam in the vicinity of the speaker or speakers. This results in a reduced disturbing sound level, such that only a minimum amount of sound is directed to the active speaker.

[0011] A still further embodiment of the sound reinforcement system according to the invention is characterized in that the sound reinforcement system comprises a dynamic echo suppressor (DES) coupled between the microphone beamformer and the adaptive loudspeaker beamformer for suppressing remaining echoes by using a time delay between the amplitudes of a microphone signal frequency component and the same remaining echo frequency component.

[0012] It is an advantage of this sound reinforcement system according to the present invention that the application of the Dynamic Echo Suppressor or DES opens possibilities for tailoring the echo cancellation such that speaker room impulse responses, as well as variations therein due to people moving in the room, are now included in the echo canceling process. This is mainly due to the fact that the DES essentially operates in the time domain for identifying a time delay between amplitudes of a multi-microphone signal frequency component and its associated remaining echo frequency component. The remaining echo can therefore be filtered out more effectively, which results in an enhanced speech intelligibility for sound reinforcement systems. This is particularly important for hands-free sound reinforcement systems, where people tend to wander around in the room, and consequently echo and reverberation properties of the room may vary considerably. These varying properties are now included in the improved echo cancellation, which in addition reduces the chances that howling due to feedback from loudspeaker(s) to microphone(s) may occur.

[0013] An embodiment of the sound reinforcement system according to the invention is characterized in that the DES is a dynamic echo noise suppressor (DENS).

[0014] Such a DENS advantageously makes use of spectral subtraction for suppressing stationary noise, while use is being made of the short-time power or magnitude spectra of its input signals.

[0015] Another further embodiment of the sound reinforcement system according to the invention is characterized in that the sound reinforcement system comprises a decorrelator coupled between the adaptive EC means and the adaptive loudspeaker beamformer for decorrelation of the microphone signal.

[0016] Because the adaptive EC means will try to remove any auto-correlation in the speaker signal, a decorrelator is included in the sound reinforcement system according to the invention, in order to prevent a "whitening" of the wanted speaker signal.
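
As an illustration of how a signal can be decorrelated without audibly changing speech, the sketch below shifts the whole spectrum by a few hertz via the analytic signal. The detailed description names a frequency shifter of about 5 Hz as a suitable decorrelator; the function names (`dft`, `analytic_signal`, `frequency_shift`), the single-sideband method and the naive O(N^2) DFT are assumptions chosen for readability and are not taken from the application.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Suppress negative frequencies so the spectrum can be shifted one-sidedly."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2      # double the positive frequencies
        elif k > n / 2:
            spectrum[k] = 0       # remove the negative frequencies
    return dft(spectrum, inverse=True)


def frequency_shift(x, shift_hz, fs):
    """Shift every frequency component of x upward by shift_hz (hypothetical helper)."""
    z = analytic_signal(x)
    return [(z[m] * cmath.exp(2j * math.pi * shift_hz * m / fs)).real
            for m in range(len(x))]


if __name__ == "__main__":
    fs = 1000                     # assumed sampling rate in Hz
    n = 200                       # frame length, giving 5 Hz per DFT bin
    tone = [math.sin(2 * math.pi * 100 * m / fs) for m in range(n)]
    shifted = frequency_shift(tone, 5.0, fs)
    mags = [abs(c) for c in dft(shifted)[: n // 2]]
    print("peak bin after shift:", mags.index(max(mags)))
```

Because the shift moves all components by the same number of hertz rather than by the same ratio, harmonic relations are slightly disturbed; at a few hertz this is the regime in which the description reports that perceptual quality remains good.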

[0017] A still further embodiment of the sound reinforcement system according to the invention is characterized in that the sound reinforcement system comprises a limiter coupled between the adaptive EC means and the adaptive loudspeaker beamformer for limiting gain in the sound reinforcement system.

[0018] It is an advantage of the sound reinforcement system according to the invention that the system remains stable even if amplifier gains are suddenly enlarged and microphones and/or loudspeakers are moved around in a room. Furthermore it additionally prevents howling in abnormal situations, by decreasing the roundtrip gain.
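
The gain-limiting behaviour can be sketched as follows. This is a simplified illustration, not the limiter of the application: it uses a single gain that moves exponentially toward a target, whereas the detailed description speaks of a product of an attack gain and a decay gain. The threshold ratio sqrt(P_s / P_limit) and the time constants of 0.01 s and 5.0 s are taken from the description; the function name `limiter_gains`, the frame-based update and the recovery law are assumptions.

```python
import math


def limiter_gains(power, p_limit, frame_s, t_attack=0.01, t_decay=5.0):
    """Return one limiter gain per frame for a sequence of smoothed powers.

    power    : smoothed signal power per frame (measured before the limiter)
    p_limit  : power threshold above which the gain is reduced
    frame_s  : frame duration in seconds
    """
    gain = 1.0
    gains = []
    for p in power:
        if p > p_limit:
            # Gain ratio G_r = sqrt(P_s / P_limit); aim for a gain of 1 / G_r.
            target = 1.0 / math.sqrt(p / p_limit)
        else:
            target = 1.0
        # Fast time constant when reducing the gain, slow one when recovering.
        tau = t_attack if target < gain else t_decay
        alpha = math.exp(-frame_s / tau)
        gain = target + (gain - target) * alpha
        gains.append(gain)
    return gains


if __name__ == "__main__":
    # 5 quiet frames, 50 frames at four times the limit, 5 quiet frames.
    power = [1.0] * 5 + [4.0] * 50 + [1.0] * 5
    gains = limiter_gains(power, p_limit=1.0, frame_s=0.01)
    print(round(gains[4], 3), round(gains[54], 3), round(gains[-1], 3))
```

With these values the gain falls quickly to about one half while the power is four times the limit and then creeps back toward one, which mirrors the qualitative behaviour of a fast attack and slow recovery.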

[0019] Still another embodiment of the sound reinforcement system according to the invention is characterized in that the sound reinforcement system comprises an equalizer coupled between the decorrelator and the adaptive loudspeaker beamformer.

[0020] Advantageously the equalizer flattens a possibly coarse frequency characteristic of the path between the loudspeakers and the listener(s).

[0021] The sound reinforcement system according to the invention, which may be a hands-free system, may advantageously be embodied as a public address system, a congress system, a conferencing system, or a communication system such as a passenger communication system for a vehicle such as a car, aeroplane or the like.

[0022] At present the sound reinforcement system according to the invention will be elucidated further together with its additional advantages, while reference is being made to the appended drawing, wherein similar components are being referred to by means of the same reference numerals. In the drawing:

[0023] FIG. 1 shows a schematic diagram of a fully equipped sound reinforcement system with the help whereof several possible sub embodiments of the system will be elucidated;

[0024] FIG. 2 shows a possible embodiment of a Dynamic Echo Suppressor (DES) for application in the sound reinforcement system of FIG. 1; and

[0025] FIG. 3 shows amplitude versus time graphs of a near end signal (solid line) and an echo signal (dotted line) respectively for explaining the operation of the DES of FIG. 2.

[0026] FIG. 1 shows a block diagram of a total sound reinforcement system 1. The system 1 may range from a public address system where only one speaker addresses a large audience to a congress system where the role of listener and speaker changes continuously among participants. The system 1 comprises one or more microphones 2 and one or more loudspeakers 3. Together with appropriate signal processing it is possible to create radiation patterns for both a loudspeaker array 3 and a microphone array 2.

[0027] In all applications of such a system 1 the aim is to enhance the speech intelligibility. Without such a system the speech intelligibility is often too low because of a low Signal-to-Noise Ratio (SNR) or because the reverberation is too high. Without extra measures the microphone(s) 2 that are used have to be close to the mouth of the participants and only one speaker can be active at a certain time. Only then can it be guaranteed that the acoustic feedback between the loudspeaker(s) 3 and the microphone(s) 2 is low and that no howling occurs at sufficiently high sound output powers. It also guarantees that the microphone signal has a good SNR and that the direct sound field component dominates the diffuse sound field component, i.e. the microphone signal does not sound reverberated.

[0028] In a number of applications the participants do not want to have the microphones 2 close to their mouth and do not want to push a button once they want to speak. An example is a boardroom conference, where people are sitting around a large table and want to work and communicate without being hindered by communication equipment. This is possible by placing the microphones 2 and loudspeakers 3 further away and allowing simultaneous talking. Another application is conferencing within a car. Due to the large background noise and the position of the driver and the passengers the speech intelligibility is usually low. An attractive solution here is to locate microphones 2 in the neighborhood of the participants (in the ceiling for example) and use the distributed loudspeakers 3 of the audio system within the car.

[0029] In the above-mentioned situations additional signal processing has to be applied to guarantee that at the required sound pressure levels no howling occurs and that the speech that is picked up by the microphones 2 is enhanced, i.e. the background noise is removed and reverberation of the desired speech signal is suppressed.

[0030] A similar problem is encountered with systems 1 like loudspeaking (or hands-free) telephony and video conferencing systems. Also then the user wants to move around freely and does not want to be bothered by the communication equipment. The latter includes that the connection is full-duplex. Signal processing is needed then to remove the acoustic echoes and reverberation of the desired speech, and additional processing may be needed to remove the background noise.

[0031] The system 1 further comprises adaptive echo canceling (EC) filter means 4. Within this filter means 4 the transfer function of each loudspeaker-microphone pair is estimated and with this transfer function the echo y_s(n) (with s the channel index) in each microphone signal z_s(n) can be estimated and subsequently be subtracted from each microphone signal. The resulting signal is called the residual signal r_s(n). The outputs of the adaptive filter means 4 contain for each channel s both the estimated echo y_s(n) and the residual signal r_s(n).
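
The estimate-and-subtract step for one loudspeaker-microphone pair can be sketched as below. The application does not state which adaptation rule the filter means 4 uses; a normalized least-mean-squares (NLMS) update is assumed here purely for illustration, and the function name `nlms_echo_canceller` and the parameters `num_taps`, `mu` and `eps` are hypothetical. The signal names follow the text: x(n) is the loudspeaker input, z_s(n) the microphone signal, y_s(n) the estimated echo and r_s(n) the residual.

```python
import random


def nlms_echo_canceller(x, z, num_taps, mu=0.5, eps=1e-8):
    """Estimate the echo of x contained in z and subtract it (NLMS, assumed rule).

    Returns (y_hat, r, w): estimated echo, residual signal, final filter weights.
    """
    w = [0.0] * num_taps          # current estimate of the echo path
    buf = [0.0] * num_taps        # most recent loudspeaker samples, newest first
    y_hat, r = [], []
    for x_n, z_n in zip(x, z):
        buf = [x_n] + buf[:-1]
        y_n = sum(wi * bi for wi, bi in zip(w, buf))      # estimated echo y_s(n)
        r_n = z_n - y_n                                   # residual r_s(n)
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * r_n * bi / norm for wi, bi in zip(w, buf)]
        y_hat.append(y_n)
        r.append(r_n)
    return y_hat, r, w


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    h = [0.5, -0.3, 0.1]          # toy echo path, an assumption for the demo
    z = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(len(x))]
    _, r, w = nlms_echo_canceller(x, z, num_taps=3)
    print([round(wi, 4) for wi in w])
```

The demo contains no near-end speech, so the residual decays to practically zero. In a sound reinforcement system the near-end speaker is always present and drives x(n), which is the reason the description adds a decorrelator and a dynamic echo suppressor around the adaptive filter.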

[0032] The system 1 also comprises a microphone beamformer 5 coupled to the filter means 4. The task of this beamformer 5 is to focus the beam on the active speaker, that is, the input signals r_s(n) are filtered (or weighted) and summed together in such a way that the active speaker signal is emphasized, and reverberation and possibly background noise are suppressed. The filter coefficients (or weights) are determined adaptively, but this requires that during adaptation there is no (strong) echo. Contrary to the conferencing applications, where we can adapt the microphone beamformer 5 when only the near-end speaker is active, we now always have double talk and have to remove the echoes first. The microphone beamformer 5 has as inputs the residual signals r_s(n) and delivers an enhanced signal r(n) at its output 6. In addition the estimated echoes y_s(n) are treated in exactly the same way as the residual signals r_s(n), giving the output signal y(n). The signal y(n) is needed by a Dynamic Echo Suppressor (DES) 7, which may be a Dynamic Echo Noise Suppressor (DENS), as will be explained hereafter.

[0033] The DES 7 suppresses the remaining echoes and, embodied as DENS 7, also suppresses (stationary) noise components, without distorting the near-end signal (if possible). Within the residual signals there will always be some remaining echoes for the following reasons. First, the number of coefficients of the adaptive filters 4 is too small to model the room impulse responses completely, and secondly the adaptive filter 4 is not able to track the variations in the impulse response when people are moving. The DENS 7 has strong similarities with spectral subtraction for stationary noise suppression and uses the short-time power or magnitude spectra of y(n), r(n) and z(n) respectively, where z(n) is calculated within the DENS as z(n)=y(n)+r(n) and can be seen as the output 6 of microphone beamformer 5 with the signals z_s(n) as inputs of the filters 4. The requirements for the DENS 7 are much stronger when compared with teleconferencing. With teleconferencing possible distortions of the far-end speaker due to the DENS at the far-end side are masked by the near-end speaker itself. Moreover, double talk does not occur often in teleconferencing applications. With sound reinforcement systems 1, there is always double talk and the loudspeaker output perceived by the listeners is generally much stronger than the near-end speaker and as a result, possible artifacts are not masked by the near-end speaker.

[0034] The system 1 may also comprise a limiter 8. To guarantee that the system 1 remains stable even if amplifier gains are suddenly enlarged and microphones 2 and/or loudspeakers 3 are moved, a limiter 8 is added to the system 1. Its task is to prevent howling in abnormal situations, by decreasing the gain.

[0035] A decorrelator 9 will also be included in the sound reinforcement system 1. A decorrelator will generally be necessary for proper operation of the adaptive filter 4. The adaptive filter 4 tries to decorrelate its residual signal r with its input signal x. Without a decorrelator 9, x is just a scaled version of r and, as a result, the adaptive filter 4 tries to remove the autocorrelation of the desired speaker, i.e. tries to "whiten" the desired speaker. By applying a decorrelator we can solve this problem. It is essential of course that the decorrelation does not change the perceptual quality of the desired signal. For speech signals a decorrelator 9 embodied as a frequency shifter is a very good candidate. With a shift of about 5 Hz, the decorrelation properties are good, perceptual quality remains good and it even helps to keep the total system 1 stable in situations where the acoustic path is suddenly changed.

[0036] An equalizer 10 may also be included in the system 1. Details of such an equalizer are set out in applicant's published International patent application WO 96/32776, the content whereof is included here by reference thereto. With the equalizer 10 the coarse frequency characteristic of the loudspeaker-listener path(s) is (are) flattened. When the loudspeaker(s)-microphone(s) paths are a good estimate for this (usually the case when the loudspeaker(s) 3 and microphone(s) 2 are not close together), then also information from the transfer functions from the adaptive filter 4 can be used to automatically adapt filters present in the equalizer.

[0037] In another possible embodiment the system 1 comprises a loudspeaker beamformer 11 in case there are two or more loudspeakers 3. The loudspeaker beamformer 11 can be used to create a beam pattern that focuses on the listeners. It may then take information from the microphone beamformer 5 and is then able to achieve a null in the direction of the speaker.

[0038] Although problems between sound reinforcement systems 1 applied as handsfree teleconferencing systems and "handsfree" sound reinforcement systems are similar, there are three aspects which will be mentioned here that make the sound reinforcement case technically more difficult:

[0039] 1) The adaptive filter 4 that is used to remove the estimated echo is never able to learn in a situation where the echo is not disturbed by a near-end speaker. This is because the near-end speaker acts as the driving force for the loudspeaker signal, whereas in a teleconferencing case the far-end speaker acts as the driving force.

[0040] 2) There is continuously a situation of double talk, being the most difficult situation. In a teleconferencing application most of the time either the far-end talker or the near-end talker is active. If during double talk the far-end talk is a little distorted, because of inappropriate echo cancellation at the far-end side, this is easily masked by the near-end speaker. This holds for the near-end speaker himself, but also for listeners in the near-end room. With sound reinforcement systems the perceived loudspeaker signal is much stronger and much less use can be made of the masking effect.

[0041] 3) Algorithmic delay should be minimized. The total delay between the microphone signal and the loudspeaker signal should be less than ten msec.

[0042] A general architecture for a "hands-free" sound reinforcement system 1 is proposed that copes with the difficulties just mentioned. However the architecture disclosed allows various modifications, also the ones already mentioned above.

[0043] The adaptive filter section 4 will be embodied in dependence on the specific arrangement as to the number of microphones 2 and loudspeakers 3 which are included in the sound reinforcement system 1. Such specific arrangements having one microphone and one loudspeaker, one microphone and several loudspeakers, several microphones and one loudspeaker, or several microphones and several loudspeakers are known per se in the prior art.

[0044] The microphone beamformer 5 has the task to focus the beam on the active speaker by filtering or weighting the different inputs and summing them together in such a way that the active speaker signal is emphasized and that the background noise and reverberation are suppressed. In some applications it is important that an adaptive beamformer is available that can track a moving speaker. The most well-known adaptive beamformer is a Delay-and-Sum beamformer, where it is assumed that the desired speech signals in the microphone signals are delayed versions of each other, depending on the direction of arrival. By correlating the microphone signals the delays can be determined and, for spatially white noise, a logarithmic attenuation can be obtained. The free field assumption on which the Delay-and-Sum beamformer is based is often not valid in practice. Especially if the microphone array 2 is placed close to other objects, like a table or a wall or is placed on top of a monitor,
the speech signals are not just delayed versions of each other but also contain severe reflections and reverberation. Determination of the delays is not obvious then and the overall performance is not optimal. Alternative adaptive beamformers are a Weighted Sum Beamformer (WSB) and a Filtered Sum Beamformer (FSB). Details of such adaptive beamformers are set out in applicant's published International patent application WO 99/27522, the content whereof is included here by reference thereto. Within the WSB each microphone signal is weighted and summed. The weights are (adaptively) determined such that the output power is maximized under certain constraints. Such a WSB is particularly suited for applications where the microphones 2 point away from each other, or in applications where the microphones 2 are far away from each other. With the FSB each microphone signal is filtered with an FIR filter and summed. Also here the weights are adaptively determined in such a way that the output power is maximized under a certain constraint. The Filtered Sum Beamformer is especially suited for cases where the microphones all pick up a significant portion of the sound together with first reflections. The FSB filters automatically compensate for the delays and first reflections. The WSB and FSB filters 5 can be extended to so-called Generalized Sidelobe Cancellers. Apart from the enhanced speech signal the WSB and FSB can be extended with additional outputs that contain mainly noise. The outputs can serve as reference inputs for a subsequent multichannel adaptive noise canceller, where the enhanced speech output of the beamformer serves as primary input. In this way the noise can be further reduced.

[0045] The Dynamic Echo Suppressor (DES) 7, which may possibly be extended to a Dynamic Echo Noise Suppressor (DENS) 7, can successfully be used for acoustic echo canceling. With reference to FIG. 2 a brief description of its operation follows, but first some notational conventions used hereafter will be given.

[0046] The sampling index is denoted by n (n = ..., −1, 0, 1, ...). We use block processing where a real-valued discrete time signal x(n) is segmented according to x(B·l_B − i), with B the data block size, l_B the block index according to l_B = ⌊n/B⌋ (here ⌊·⌋ denotes integer truncation), and i = 0, 1, ..., B−1. Thus the newest available data sample of x(n) is x(B·l_B). The M-points DFT result of x is denoted by X(k; l_B) with k the frequency index (k = 0, 1, ..., M−1). Note that with real-valued time-domain data we do not need to consider negative frequencies in a practical implementation, but for notational convenience we will here continue to do so. F_s is the sampling rate in Hertz, FIR stands for Finite Impulse Response and IIR for Infinite Impulse Response, N denotes the number of the FIR filter coefficients.

[0047] The DES 7 (we leave out the noise component for a moment) takes as its input segmented time frames and transforms these frames into magnitude spectra, denoted by |Y(k; l_B)|, |Z(k; l_B)|, and |R(k; l_B)|. It next applies a frequency dependent (non-negative) attenuation G(k; l_B) to |R(k; l_B)|, yielding |R̃(k; l_B)|. The time-domain signal q(n) is reconstructed by an inverse spectral transformation on |R̃(k; l_B)| exp{−jφ_R(k; l_B)}, with φ_R(k; l_B) the phase of the residual spectrum R(k; l_B). The attenuation function G(k; l_B) is calculated as follows. First per frame an attenuation function G(k; l_B) is calculated according to:

[0048] With l_B the frame number, γ_e the subtraction factor for the echo term, and |Y_r(k; l_B)| an estimate of the residual echo magnitude to compensate for the fact that the adaptive filter has too few coefficients to model the complete (infinite length) room impulse response. To prevent G(k; l_B) from changing too rapidly between iterations we apply a low-pass recursion according to:

[0049] Thus, in frequency bands with a strong far-end echo (Y is an estimate of the echo) when compared with the near-end signal the residual R is attenuated, and in bands where the near-end signal is much stronger than the far-end echo the residual remains approximately the same. With teleconferencing applications use is made of the assumption that the short-time spectrum of the far-end signal differs from the short-time spectrum of the near-end signal and we can suppress the echo components without suppressing the near-end signal. With sound reinforcement systems the situation is different. The spectrum of the near-end speech does not differ significantly from the spectrum of the echo, since the near-end speaker is the driving force. The difference in time-scale between the near-end speech and the echoes can however be used.

[0050] In FIG. 3 the magnitude for a certain frequency component of the microphone signal is given as a function of time. The solid line depicts the near-end signal whereas the dotted line gives the echoes. The echoes start after the near-end signal due to the processing delay and the acoustic propagation delay between the loudspeaker and the microphone. The decay is determined both by the reverberation time of the room and the open loop gain of the system. Let us now check how the DES reacts in this case: |Y(k; l_B)| + |Y_r(k; l_B)| is an estimate of the echo (the dotted line in FIG. 3). When the estimate is accurate and the echoes are uncorrelated with the near-end signal and we would have subtracted the squared estimate from the squared Z-signal, then the result would be equal to the squared near-end speech signal. The estimate is not so accurate however and experiments have shown that we can take as well the amplitudes together with oversubtraction (γ_e > 1). If we oversubtract the echo then it follows from FIG. 3 that only the decay of the near-end speech is distorted. During the attack and after the decay there will be no distortion. During the decay the distortion is not so important. Because of the reverberation in the room we can even say that the decay of the speech is already distorted by this reverberation. Experiments have shown that there is indeed some dereverberation effect when we apply some oversubtraction. The larger the loop gain is, the more important it is that the combination of adaptive filter and DES subtracts or suppresses the echoes. At very large gains (up to 20 dB!) stability is more an issue than some distortion during the decay of the near-end speech, as opposed to the situation where the loop gain is less than one. For this reason γ_e depends on the loop gain. The loop gain can directly be obtained from the weights of the adaptive filter means 4, since they represent the frequency characteristic between the microphone 2 and loudspeaker 3 and determine the open loop gain if the rest of the system has a gain of unity. γ_e is chosen smaller than one if the maximum loop gain is smaller than one and larger than one if the maximum loop gain is larger than one.

[0051] Another problem to be addressed is the algorithmic delay of the DENS. Normally, the DENS is a linear phase
filter and gives an extra delay that equals the data block length B of the DES. If a DENS is implemented as a minimum-phase filter then no extra delay is added.

[0052] The task of the limiter 8 is to reduce the gain of the system in case the system 1 becomes unstable, due for example to the movement of a microphone or loudspeaker, or to the sudden increase of the loudspeaker volume. It is especially important if the system is designed for operation far above howling. In such a situation the echoes are much stronger than the signal of the near-end speaker and the gain of the microphone preamplifier is determined by the echo. As a result, after compensating the echoes with the adaptive filter 4 and the DES or DENS 7 there will be a huge head-room for the near-end speech. A limiter may then be necessary to reduce the gain, if the echoes are not compensated well, during drastic changes in the loudspeaker-microphone path(s). The limiter function itself is a standard one. The limiter gain may be the product of two gains: an attack gain and a decay gain.

[0053] Normally G_l equals one. Once the smoothed power P_s of the output signal q(n) exceeds a threshold P_limit, a gain ratio G_r is determined as:

$$G_r = \sqrt{P_s / P_{limit}}$$

[0054] and G_g is put equal to G_l.

[0055] G_a and G_d are then given by:

$$G_a = \frac{G_g}{G_r} + \left(G_g - \frac{G_g}{G_r}\right) e^{-t/T_a}$$

[0057] Typical values for T_a and T_d are 0.01 and 5.0 seconds respectively. As a result G_l decreases rapidly toward G_g/G_r and subsequently grows slowly to 1 again.

[0058] As explained above a decorrelator is necessary to prevent that the adaptive filter 4 tries to "whiten" the desired signal. Details of such a decorrelator are set out in applicant's U.S. Pat. No. 5,748,751, the content whereof is included here by reference thereto. For speech applications a frequency shifter performs very well. When a frequency shift of approximately 5 Hz is applied, it both decorrelates the signal and helps to keep the system 1 stable as well. The frequency characteristic between a loudspeaker 3 and a microphone 2 in a room shows many peaks and dips. The average frequency spacing between adjacent minima and maxima is only a few Hz. When a frequency shifter is applied the average loop gain becomes important instead of the maximum loop gain.

[0059] For gains with a maximum loop gain above 0 dB and an average loop gain below 0 dB a system with a frequency shifter, but without an adaptive filter, remains stable. The artefacts however, are disturbing because of the

done off-line. A white or pink noise source is used as excitation source and a microphone is placed at the position of the listener. The response is measured in octaves or 1/3-octaves and the equalizer 10 is adjusted until a flat (or otherwise desired) response is obtained. If more listeners are available (often the case) the procedure is repeated and an average curve is obtained. A drawback of this method is that the adjustment is fixed. If the conditions change (full or empty room for example), no adjustments can be made anymore. From experiments we have found that the frequency characteristic between the loudspeaker 3 and microphone 2 (especially if the loudspeaker is not too close to the microphone), when measured in octaves or 1/3-octaves, is representative for the transfer function between the loudspeaker and the participant(s). In such a situation we can use the estimate of the adaptive filter 4 for adjusting the equalizer 10. The adjustment may be done automatically and iteratively if the equalizer 10 is placed after the input 12 of the adaptive filter means 4 as is shown in FIG. 1. That is, the adaptive filter 4 tries to estimate the transfer function of the combination of the equalizer 10 and the acoustic path. For a single loudspeaker-multiple microphone case the same can be done. In that case one has to calculate an average transfer function from the available transfer functions in the adaptive filter 4. In case of a multiple loudspeaker-single microphone case there are two possibilities: an equalizer 10 can be placed in each loudspeaker path and the same procedure can be used as for the single loudspeaker-single microphone case, or an equalizer can be placed before the loudspeaker beamformer 11. When using the background model concept of the adaptive filter 4 the transfer function to be used for estimating the equalizer coefficients is given by the sum of the individual transfer functions weighted or convolved by the coefficients or FIR-filters of the loudspeaker beamformer 11.

[0061] With the loudspeaker beamformer 11 we are able to shape the directional pattern of the loudspeaker array 3. As was the case with the microphone beamformer 5, the loudspeaker beamformer is also adaptive. Contrary to the microphone beamformer 5, it is not obvious how to adapt the loudspeaker beamformer, i.e. where the loudspeaker beamformer has to point to. Extra measures are necessary to let the system 1 know where the listeners are located. Possibilities are an attention button at the beginning of a meeting (conference application), video tracking using a camera to extract the positions of listeners and the like. Depending on the loudspeaker configuration a Weighted Sum Beamformer, a Delay and Sum Beamformer or even a Filtered Sum Beamformer can be used. It is important that all individual amplifiers have the same gain and that there is one overall
roundtrips of the sound (each time With a shift of 5 HZ)
through the loop. With an adaptive ?lter 4 (and a DE(N)S)
the attenuation provided by the adaptive ?lter is suf?cient to
suppress these artefacts.
[0060] In possible embodiments of the sound reinforce
ment system 1 a parametric equaliZer 10 is used to adjust the
gain adjustment. OtherWise the radiation pattern depends on
the differences in ampli?cation values of the individual
ampli?ers. If the information With respect to the listeners is
not available, then the beamformer still can be useful by not
pointing to the active speaker. For the speaker the sound that
is directed to him is not of any use, it is even disturbing.
Also, the acoustic coupling betWeen the loudspeaker beam
that is directed to the speaker and the microphone beam (also
directed to the speaker) Will be large in general. Reducing
this coupling Will improve overall system behavior. Note
that in this case the loudspeaker beamformer 11 is deter
equaliZer is used, ie the bandWidth increases With increas
mined by the settings of the microphone beamformer 5. If
for example both the microphone and loudspeaker beam
ing frequency. The adjustment of the equaliZer 10 is mostly
former are Weighted Sum Beamformers and the coefficients
frequency response. Often an octave or 1/3-octave band
(W1, W2, . . . Ws) of the microphone beamformer 5 are (1, 0,
. . . 0), then the coefficients (Wl1, Wl2, . . . Wls) of the
loudspeaker beamformer 11 will be equal to (0, 1, . . . 1). In
addition it is to be noted that in this case equally indexed
loudspeakers and microphones cover the same acoustic area
in the room concerned.
[0062] In this section three applications are described. The
first one has to do with a high-end speakerphone unit with
multiple microphones and a single loudspeaker. The second
one has to do with multiple units and the third one has to do
with a sound reinforcement system within a car.
[0063] The speakerphone unit can be used for audio
conferencing applications. It is also possible however to use
it for sound reinforcement in boardrooms. The block
diagram of the processing is shown in FIG. 1. The microphone
beamformer 5 in this case consists of a Weighted Sum
Beamformer that picks up the speech signal as is the case
with audio conferencing. Also in this case external
microphones 2 can be used if the participants are far away from
the unit. The output of the beamformer 5 is fed through the
DES/DENS 7, the limiter 8, frequency shifter decorrelator 9
to the input 12 of the adaptive filter means 4, and after
passing the equalizer 10 to the loudspeaker 3. If there is only
one loudspeaker 3, there is no need for a loudspeaker
beamformer 11. One might think of a speakerphone unit
with three loudspeakers, each pointing in the direction of a
corresponding microphone. A loudspeaker beamformer 11
coupled to the microphone beamformer 5 can be used then,
as explained above. The loudspeaker 3 emits the sound and
the adaptive filters 4 compensate for the echoes. In larger
meeting rooms one sound unit is not enough. The extension
microphones should then be replaced by other sound units.
In such an application we have a master sound unit and one
or more slave sound units. In addition to the echo corrected
microphone signals from the slaves to the master, now also
the loudspeaker signal from the master has to be transported
to the slaves. An extra Weighted Sum Beamformer (WSB)
may then be added between the limiter 8 and the decorrelator
9, which WSB sums (after weighting) the cleaned echo signal
of the sound unit itself and the signals coming from the slave
sound units. The output signal that is sent to the slave sound
units is obtained after the frequency shifter decorrelator 9.
[0064] An interesting application is found in a car
environment. The passengers at the back of the car often do not
understand the driver and the passengers in front of the car,
due to the orientation of the speakers and the background
noise. By placing a microphone 2 close to all participants
(e.g. in the roof of the car) and using the already existing
loudspeakers 3 in the car, a sound reinforcement system 1
can be set up as is depicted in FIG. 1. The adaptive
beamformer 5 is again a WSB that acts as a fast microphone
selector; the DENS does not only suppress the residual
echoes but also the stationary noise. We can work with a
single-loudspeaker, multiple-microphone configuration, but
we can also introduce a loudspeaker beamformer 11 and
suppress the loudspeaker that is used for the person that
speaks. In that case we need the adaptive background model
concept as was explained in the above.
[0065] In this section some implementation details are
given for a sound system 1 with only one loudspeaker 3 and
without an equalizer 10. A system has been developed with
a sample frequency of 16 kHz. To reduce the algorithmic
delay, block processing with a block size B of only 64
samples is used (when compared with 256 samples in the
audio conferencing application). As is depicted in FIG. the
programmable filter part of the adaptive filter 4, the
beamformer 5, the filter part of the DES/DENS 7, the limiter 8 and
the decorrelator 9 all operate on blocks of B samples.
Working with blocks in a closed loop system gives some
problems, unless there is somewhere a delay of at least B
samples. Due to a serial to parallel conversion in the
microphone path and the parallel to serial conversion in the
loudspeaker path the impulse response will always contain
at least 2B samples. It is advantageous then to put a delay of
at least 2B samples in front of both the adaptive filter means
4, since this delay models the at least first 2B samples of the
impulse response. For the filter length of the adaptive filter
N=2048 is chosen. For the adaptive filter means 4 itself both
an unconstrained Block Frequency Domain Adaptive Filter
(BFDAF) and a (constrained) Partitioned Block Frequency
Domain Adaptive Filter (PBFDAF) have been used. Thereto
reference is again made to U.S. Pat. No. 5,748,751. For the
PBFDAF a partition length of 512 coefficients has been used.
For the analysis part of the DENS a data block size of 512
points is taken.
[0066] Thus a "hands-free" sound reinforcement system is
presented that comprises an adaptive filter section 4, a
microphone beamformer 5, a dynamic echo suppressor DES
7 and possible noise suppressor DENS 7 and a decorrelator
9. Optionally a limiter 8, an equalizer 10 and a loudspeaker
beamformer 11 can be added. We presented two major
applications. The first one deals with boardroom
applications, where a board of directors needs a real hands-free
sound reinforcement system 1, whereas the second one deals
with a hands-free sound reinforcement system 1 in a car
environment.
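As a hedged illustration of the figures quoted for the single-loudspeaker implementation (16 kHz sample frequency, block size B = 64 against 256 for audio conferencing, a delay of at least 2B samples, filter length N = 2048 and a partition length of 512 coefficients), the sketch below converts them into durations and a partition count. The constant and function names are chosen here for illustration and are not taken from the description; only the numbers are.

```python
# Illustrative arithmetic only; names are hypothetical, numbers are from the text.
SAMPLE_RATE_HZ = 16_000          # sample frequency of the developed system
BLOCK_SIZE = 64                  # block size B used for sound reinforcement
CONFERENCING_BLOCK_SIZE = 256    # block size of the audio conferencing application
FILTER_LENGTH = 2048             # adaptive filter length N
PARTITION_LENGTH = 512           # PBFDAF partition length in coefficients


def samples_to_ms(num_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a number of samples into milliseconds at the given sample rate."""
    return 1000.0 * num_samples / sample_rate_hz


# Duration of one processing block for both block sizes.
block_ms = samples_to_ms(BLOCK_SIZE)
conferencing_block_ms = samples_to_ms(CONFERENCING_BLOCK_SIZE)

# Minimum delay of 2B samples placed in front of the adaptive filter.
bulk_delay_ms = samples_to_ms(2 * BLOCK_SIZE)

# Length of impulse response that N coefficients can model.
filter_span_ms = samples_to_ms(FILTER_LENGTH)

# Number of partitions when N coefficients are split into 512-coefficient parts.
num_partitions = FILTER_LENGTH // PARTITION_LENGTH

print(block_ms, conferencing_block_ms, bulk_delay_ms, filter_span_ms, num_partitions)
```

Under these numbers one block lasts 4 ms instead of 16 ms, the 2B delay amounts to 8 ms, and the 2048 coefficients span 128 ms of impulse response in four partitions.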
[0067] Whilst the above has been described with reference
to essentially preferred embodiments and best possible
modes it will be understood that these embodiments are by
no means to be construed as limiting examples of the devices
concerned, because various modifications, features and
combination of features falling within the scope of the
appended claims are now within reach of the skilled person.
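The complementary relation between coupled Weighted Sum Beamformers, in which microphone coefficients (1, 0, . . . 0) lead to loudspeaker coefficients (0, 1, . . . 1), may be sketched as follows. The rule "loudspeaker weight = 1 - microphone weight" is an assumption that reproduces the binary example from the description; the description does not state how intermediate weights are handled, and the function name is hypothetical.

```python
from typing import List, Sequence


def complementary_weights(mic_weights: Sequence[float]) -> List[float]:
    """Return loudspeaker weights as 1 - w for each microphone weight w.

    Assumption (not from the source): weights lie in [0, 1] and equally
    indexed loudspeakers and microphones cover the same acoustic area.
    """
    for w in mic_weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError("microphone weights are assumed to lie in [0, 1]")
    return [1.0 - w for w in mic_weights]


# Microphone beam fully on the first talker: the loudspeaker covering
# that area is muted and the remaining loudspeakers stay active.
mic = [1.0, 0.0, 0.0]
speakers = complementary_weights(mic)
print(speakers)
```

With this choice the loudspeaker aimed at the active talker receives no signal, which is the "null" toward the speaker that reduces the acoustic coupling into the microphone beam.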
1. A sound reinforcement system (1) comprising at least
one microphone (2), adaptive echo compensation (EC)
means (4) coupled to the at least one microphone (2) for
generating an echo compensated microphone signal, and at
least one loudspeaker (3) coupled to the adaptive EC means
(4), characterized in that the sound reinforcement system (1)
further comprises a microphone beamformer (5) coupled to
the adaptive EC means (4); and an adaptive loudspeaker
beamformer (11) coupled between the adaptive EC means
(4) and several of the loudspeakers (3) for shaping the
directional pattern of the loudspeakers (3).
2. The sound reinforcement system (1) of claim 1,
characterized in that the adaptive loudspeaker beamformer (11)
is a Weighted Sum Beamformer, a Delay and Sum
Beamformer or a Filtered Sum Beamformer.
3. The sound reinforcement system (1) of claim 1 or 2,
characterized in that the adaptive loudspeaker beamformer
(11) is coupled to the microphone beamformer (4), while
both beamformers (11 and 4) have beamformer coefficients,
such that the combined loudspeaker beam pattern and the
combined microphone beam pattern are complementary.
4. The sound reinforcement system (1) of any of the
claims 1-3, characterized in that the sound reinforcement
system (1) comprises a Dynamic Echo Suppressor (DES 7)
coupled between the microphone beamformer (4) and the
adaptive loudspeaker beamformer (11) for suppressing
remaining echoes by using a time delay between the
amplitudes of a microphone signal frequency component and the
same remaining echo frequency component.
5. The sound reinforcement system (1) of claim 4,
characterized in that the DES (7) is a dynamic echo noise
suppressor (DENS).
6. The sound reinforcement system (1) according to one
of the claims 1-5, characterized in that the sound
reinforcement system (1) comprises a decorrelator (9) coupled
between the adaptive EC means (4) and the adaptive
loudspeaker beamformer (11) for decorrelation of the
microphone signal.
7. The sound reinforcement system (1) according to one
of the claims 1-6, characterized in that the sound
reinforcement system (1) comprises a limiter (8) coupled between the
adaptive EC means (4) and the adaptive loudspeaker
beamformer (11) for limiting gain in the sound reinforcement
system (1).
8. The sound reinforcement system (1) according to one
of the claims 1-7, characterized in that the sound
reinforcement system (1) comprises an equalizer (10) coupled
between the decorrelator (9) and the adaptive loudspeaker
beamformer (11).
9. The sound reinforcement system (1) of any of the
claims 1-8, characterized in that the sound reinforcement
system (1), which may be a hands-free system, is embodied
as a public address system, a congress system, a
conferencing system, or a communication system such as a passenger
communication system for a vehicle such as a car, aeroplane
or the like.