19th European Signal Processing Conference (EUSIPCO 2011)
Barcelona, Spain, August 29 - September 2, 2011
VIRTUAL SOURCE PANNING USING MULTIPLE-WISE VECTOR BASE
IN THE MULTISPEAKER STEREO FORMAT
Se-Woon Jeon, Young-cheol Park∗ , Seok-Pil Lee∗∗ , and Dae Hee Youn
School of Electrical & Electronic Engineering, Yonsei University, Seoul, Korea
∗ Computer & Telecommunications Engineering Division, Yonsei University, Wonju, Gangwon, Korea
∗∗ Korea Electronics Technology Institute (KETI), Seoul, Korea
phone: + (82) 2-2123-4534, fax: + (82) 2-364-4870, email: jdotsw@dsp.yonsei.ac.kr
ABSTRACT
In the last few decades, various panning algorithms have
been proposed to generate virtual sound localization in loudspeaker systems. Vector base amplitude panning (VBAP)
is the most widely adopted pair-wise amplitude panning
method. However, pair-wise amplitude panning has a directional discontinuity problem in the multichannel surround
panning. In particular, sound localization for the virtual
source does not smoothly vary in the direction of the near
speaker. While coincident panning using multiple loudspeakers, such as Ambisonics, performs better, it has less
stability in sound localization due to the precedence effect.
In this paper, a multiple-wise vector base virtual source panning algorithm is proposed. To generate more stable localization of panning sound, the proposed panning algorithm
calculates the amplitude panning gains using multiple-wise
vector base formulation. Additionally, the angle-dependent
and nonnegative gain control function is applied to prevent
artifacts caused by negative amplitude gains. The subjective
listening tests using the method of adjustment (MOA) are
performed to evaluate the sound localization of the proposed
algorithm, which is compared with the conventional amplitude panning methods.
1. INTRODUCTION
High-definition (HD) and ultrahigh-definition (UHD) video
formats, which provide high-quality optical resolution and a
wider angle view, are rapidly being applied to various multimedia systems. The audio format for HD and UHD video
must generate a wider sound field and more immersive sound
effects in order to optimally accompany these video formats.
However, the conventional stereo system cannot live up to the
desired sound quality in the latest multimedia systems. Much
effort has been devoted to generating spatial localization of the sound source
in multichannel loudspeaker systems, and various
sound reproduction techniques have been suggested. Using
one loudspeaker for one sound source can create the most
natural and point-like sound source. But it is difficult to construct such a system because of spatial restrictions or the
insufficiency of the loudspeakers. Thus, the panning algorithms based on the psychoacoustic theorem, e.g. summing
localization, were proposed to acoustically create a directional perception of the sound source in the spatial sound
stage [1]-[9].
Figure 1: Arbitrary angle-positioned multispeaker stereo format.
© EURASIP, 2011 - ISSN 2076-1465
The most popular panning technique is pair-wise amplitude panning. When two signals with level differences due
to gain factors emanate from two loudspeakers, the sound localization of the virtual source is perceived by the listener
as being on the opposite side of the loudspeakers. The directional position of the virtual sound source is controlled
by the amplitude level difference in the horizontal stage between two loudspeakers. The sine law and the tangent law
are basic pair-wise amplitude panning methods in the stereophonic loudspeaker format [2, 3]. In 1997, Pulkki proposed
vector base amplitude panning (VBAP), which is
available not only for a symmetric stereo layout but also for any
angle-positioned pair-wise loudspeaker setup [4]. However,
pair-wise amplitude panning, including VBAP, has the disadvantage that the virtual source is pulled in the direction of the
nearer loudspeaker; this is called the detent effect [5]. It also has a
discontinuity problem with respect to the directional spread
of the virtual source [6].
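As a concrete illustration of the tangent law cited above, the following minimal Python sketch computes the two gains for a symmetric stereo pair from the relation tan(θ)/tan(θb) = (gL − gR)/(gL + gR), where θb is the half-angle of the loudspeaker base. The function name, the sign convention (positive angles toward the left loudspeaker), and the unit-energy normalization are assumptions made for illustration; they are not taken from the paper.

```python
import math


def tangent_law_gains(pan_deg, base_deg):
    """Return (g_left, g_right) for a symmetric stereo pair at +/- base_deg.

    Assumed convention: positive pan_deg points toward the left loudspeaker.
    The gains are scaled so that g_left**2 + g_right**2 == 1.
    """
    if not 0.0 < base_deg < 90.0:
        raise ValueError("base_deg must lie strictly between 0 and 90")
    if abs(pan_deg) > base_deg:
        raise ValueError("pan_deg must lie between the two loudspeakers")
    # Tangent law: tan(pan) / tan(base) = (g_left - g_right) / (g_left + g_right)
    ratio = math.tan(math.radians(pan_deg)) / math.tan(math.radians(base_deg))
    g_left = 1.0 + ratio
    g_right = 1.0 - ratio
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm
```

For example, `tangent_law_gains(0.0, 30.0)` yields two equal gains, and `tangent_law_gains(30.0, 30.0)` routes the signal to the left loudspeaker only.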
Pulkki proposed multiple-direction amplitude panning
(MDAP) to resolve the problem of pair-wise amplitude panning [6]. MDAP generates multiple panning vectors to make
the listener perceive a single virtual source. With MDAP, the
virtual vector of one virtual source is made by the superposition of two or more virtual vectors panned by VBAP. As
a result, the local minima of directional spread at the near-speaker position can be prevented. However, the localization performance of MDAP has not been fully verified yet. In fact,
the localization performance of VBAP is known to
be angle-dependent: it is worse at the lateral sides and the rear than in the frontal
stage [7]. Therefore, the localization of a single perceived virtual source formed from
the multiple vectors in MDAP can be degraded by the lateral
panning vector.
In Ambisonics, spatial localization of the virtual source
is generated by using all of the multiple loudspeakers around
the listener [8, 9]. The gain factors of Ambisonics are allowed to be both positive and negative. It can achieve more
correct localization of the virtual source than pair-wise amplitude panning. However, the directional quality is less robust for
a listener outside the best position due to the precedence effect [10]. The negative gain can also cause the out-of-phase problem [8]. Recently, Poletti proposed panning functions for surround sound systems with nonuniform
loudspeaker layouts; these panning functions produced a robust sound-field directionality in the ITU 3/2 configuration
[11]. This resulted in better performance for sound localization,
but required additional computation for asymmetric and
arbitrary angle-positioned loudspeaker layouts.
In this paper, we present a multiple-wise vector base
panning algorithm that maintains the advantages of VBAP,
but provides more accurate localization of the virtual source
by concurrently using multiple loudspeakers. The proposed
panning algorithm is inspired by the vector base formulation of Pulkki’s VBAP and motivated by Gerzon’s panpot law
in the multispeaker stereo setup shown in Fig. 1 [4, 5]. The
main advantages of the proposed algorithm are its robustness
against the bias problem caused by the precedence effect in sound
localization, and the simplicity of the panning gain calculation for any arbitrary multichannel format. To prevent the
out-of-phase problem due to negative panning gain, a nonnegative gain control function is applied to the vector base
formulation of the gain factors.
The organization of this paper is as follows. First, the multispeaker stereo setup and the vector base panning algorithms
are introduced in Section 2. The formulation of the proposed
panning algorithm with application of a nonnegative center
channel gain function is described in Section 3. In Section 4,
we evaluate and discuss the sound localization performance
of the proposed panning algorithm by comparing it with conventional panning techniques using subjective listening tests. Finally, conclusions are given in Section 5.
2. MULTIPLE LOUDSPEAKERS FORMAT AND
VECTOR BASE AMPLITUDE PANNING METHODS
2.1 Multispeaker stereo setup
To obtain better performance of the spatial localization in the
frontal stage, the center loudspeaker is occasionally added
in the middle of the stereo format, e.g. the ITU 3/2 multichannel format. In this format, shown in Fig. 1, the panning algorithm for the frontal loudspeakers must
be changed. Gerzon defined such a format as a multispeaker
stereo format, and also calculated the optimal panning gain,
that is, the panpot law for multispeaker stereo, to satisfy
the coincident localization of the velocity vector and the energy vector [5]. Gerzon’s panpot law was established as the optimal computation method, but it assumes that the left and right loudspeakers are placed at the
same angle relative to the center direction, i.e. θ1 = θ2 in Fig. 1. Similarly, the loudspeaker format in Ambisonics must be symmetric to achieve optimal results [10]. The proposed amplitude panning algorithm assumes that the loudspeaker
configuration is arbitrary with respect to angle positioning,
that is, the layout can be either symmetric or asymmetric.
2.2 Vector base amplitude panning methods
Figure 2: Optimal multiple-wise panning gain by pseudoinverse matrixing; solid line - symmetric layout (|θ1 − θ0| = |θ2 − θ0|) and dashed line - asymmetric layout (|θ1 − θ0| ≠ |θ2 − θ0|).
The vector base formulation to calculate the panning gain is
introduced by Pulkki [4]. In the two-dimensional plane, the
vector matrix $\mathbf{L}_{12}$, which consists of a pair of loudspeaker
vectors, is multiplied by the gain vector $\mathbf{g}$ and forms the virtual source vector $\mathbf{p} = [p_1\ p_2]^T$:
$$\mathbf{p}^T = \mathbf{g}\,\mathbf{L}_{12}, \quad \text{where } \mathbf{g} = [g_1\ g_2],\ \mathbf{L}_{12} = [\mathbf{l}_1\ \mathbf{l}_2]^T. \tag{1}$$
The panning gain of two loudspeakers in the two-dimensional space can be calculated by the inverse matrix
of $\mathbf{L}_{12}$ as
$$\mathbf{g} = \mathbf{p}^T \mathbf{L}_{12}^{-1}. \tag{2}$$
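Eq. (2) can be evaluated in closed form, since the inverse of a 2×2 matrix is explicit. The sketch below is a minimal Python illustration under stated assumptions: loudspeaker and source directions are given as azimuth angles in degrees and mapped to unit vectors (cos, sin), and the names `unit_vector` and `vbap_pair_gains` are hypothetical. No gain normalization is applied, so the result is exactly the solution of Eq. (1).

```python
import math


def unit_vector(angle_deg):
    """Map an azimuth angle in degrees to a 2-D unit vector (x, y)."""
    a = math.radians(angle_deg)
    return math.cos(a), math.sin(a)


def vbap_pair_gains(pan_deg, spk1_deg, spk2_deg):
    """Solve p^T = g L_12 for g = [g1, g2], i.e. g = p^T L_12^{-1} (Eq. (2))."""
    px, py = unit_vector(pan_deg)
    l1x, l1y = unit_vector(spk1_deg)
    l2x, l2y = unit_vector(spk2_deg)
    # L_12 has rows l1 and l2; its determinant vanishes for collinear loudspeakers.
    det = l1x * l2y - l1y * l2x
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker vectors must not be collinear")
    # Row vector p^T times the explicit inverse of the 2x2 matrix L_12.
    g1 = (px * l2y - py * l2x) / det
    g2 = (py * l1x - px * l1y) / det
    return g1, g2
```

Panning exactly toward the first loudspeaker, e.g. `vbap_pair_gains(30.0, 30.0, 0.0)`, returns gains close to (1, 0), which matches the observation that only one loudspeaker is active in that direction.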
In the multichannel format, the surround panning gains
are computed by VBAP’s formulation with paired loudspeakers. In the multispeaker stereo format of Fig. 1, the loudspeaker pairs differ between the left stage (θ1 ∼ θ0) and the right
stage (θ0 ∼ θ2). The directional perception of the virtual
source is point-like in the direction of the center loudspeaker.
Point-like perception of the virtual sound source is the best
result of sound localization, but it can be affected by the directional spread, such as spatial blurring. The directional spread
increases as the panning direction moves farther from a loudspeaker [6, 10], which
means that the directional spread varies with panning direction. For example, when the virtual source moves across the
center loudspeaker in the horizontal plane, the variation of
directional spread causes a degradation of spatial sound quality.
In MDAP, the variation problem of directional spread
is improved by generating the virtual source using multiple
loudspeakers [6]. It is implemented by the superposition of
multiple-panned virtual vectors. However, the sound localization of pair-wise VBAP panning has been verified to be worse
in the lateral or rear sides than in the frontal stage [7]. As
a result, the multiple directional vectors created by MDAP
differ in localization accuracy. Thus, the perception of the virtual source can be biased in the direction of
the center because of the localization error of the more laterally positioned virtual vector. This effect is verified by the
subjective listening test in Section 4.
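The superposition idea behind MDAP can be sketched as follows: pair-wise gains are computed for several directions around the target, accumulated per loudspeaker, and then energy-normalized. This is only a minimal illustration and not the implementation used in the tests. It assumes that `spread_deg` denotes the total width of the spread, that the auxiliary directions are evenly spaced across it, and that every direction falls between two adjacent loudspeakers; all function and parameter names are hypothetical.

```python
import math


def _unit(deg):
    r = math.radians(deg)
    return math.cos(r), math.sin(r)


def _pair_gains(pan_deg, a_deg, b_deg):
    """Pair-wise vector base gains for loudspeakers at a_deg and b_deg (Eq. (2))."""
    px, py = _unit(pan_deg)
    ax, ay = _unit(a_deg)
    bx, by = _unit(b_deg)
    det = ax * by - ay * bx
    return (px * by - py * bx) / det, (py * ax - px * ay) / det


def mdap_gains(pan_deg, spread_deg, speakers_deg, n_dirs=3):
    """Return energy-normalized gains, ordered by ascending loudspeaker angle."""
    spk = sorted(speakers_deg)
    total = [0.0] * len(spk)
    for k in range(n_dirs):
        # Auxiliary directions evenly spaced over [pan - spread/2, pan + spread/2].
        offset = 0.0 if n_dirs == 1 else (k / (n_dirs - 1) - 0.5) * spread_deg
        direction = pan_deg + offset
        for i in range(len(spk) - 1):
            if spk[i] <= direction <= spk[i + 1]:
                ga, gb = _pair_gains(direction, spk[i], spk[i + 1])
                total[i] += ga
                total[i + 1] += gb
                break
        else:
            raise ValueError("direction lies outside the loudspeaker arc")
    norm = math.sqrt(sum(g * g for g in total))
    return [g / norm for g in total]
```

With loudspeakers at −90°, −30°, 0°, 30° and 90°, the call `mdap_gains(0.0, 20.0, [-90, -30, 0, 30, 90])` activates the three frontal loudspeakers, so the lateral pairs contribute to a nominally frontal source.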
Figure 3: Angle-dependent gain control function for nonnegative panning gain; solid line - symmetric layout (|θ1 − θ0| = |θ2 − θ0|) and dashed line - asymmetric layout (|θ1 − θ0| ≠ |θ2 − θ0|).
Figure 4: Proposed multiple-wise panning gain with nonnegative gain control function; solid line - symmetric layout (|θ1 − θ0| = |θ2 − θ0|) and dashed line - asymmetric layout (|θ1 − θ0| ≠ |θ2 − θ0|).
3. PROPOSED AMPLITUDE PANNING
ALGORITHM USING MULTIPLE-WISE VECTOR
BASE FORMULATION
In this section, we introduce the vector formulation to compute the optimal multiple-wise panning gain in the arbitrary
multispeaker stereo format, and the nonnegative, controllable angle-dependent function that prevents the out-of-phase
effect caused by negative panning gain.
3.1 Multiple-wise vector base amplitude panning
First, to calculate the optimal multiple-wise panning gain,
we define the multiple-vector matrix $\mathbf{L}_{012}$ to compute the
coincident panning gain for the multispeaker stereo format.
The vector of the virtual source $\mathbf{p}$ is then determined by the angle-dependent configuration of loudspeakers $\mathbf{L}_{012}$ and the vector
of panning gains $\mathbf{g}$ as
$$\mathbf{p}^T = \mathbf{g}\,\mathbf{L}_{012}, \quad \text{where } \mathbf{g} = [g_0\ g_1\ g_2],\ \mathbf{L}_{012} = [\mathbf{l}_0\ \mathbf{l}_1\ \mathbf{l}_2]^T. \tag{3}$$
This is similar to Eq. (1), but we employ the pseudoinverse method,
also known as the Moore-Penrose inverse, to invert the non-square matrix $\mathbf{L}_{012}$ [12]. By using
the multiple-wise vector base formulation, the optimal panning gain vector of multiple loudspeakers is
$$\mathbf{g} = \mathbf{p}^T \mathbf{L}_{012}^{+}, \tag{4}$$
where the pseudoinverse is $\mathbf{X}^{+} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$. As
a result, the optimal panning gains in the loudspeaker format
of Fig. 1 are shown in Fig. 2 with energy normalization
as
$$\mathbf{g}_{\text{scaled}} = \frac{\mathbf{g}}{\sqrt{g_0^2 + g_1^2 + g_2^2}}. \tag{5}$$
3.2 Nonnegative gain control function
In Section 3.1, the optimal panning gains using a multiple-vector base formulation were calculated for any loudspeaker format, i.e. symmetric or asymmetric layout, as in
VBAP [4]. However, in Eq. (4) the panning gain at directions far
from the loudspeaker is negative, as shown in Fig. 2. Panning algorithms using multiple loudspeakers, e.g. Ambisonics, were
proposed to achieve the best perception of sound localization [5, 9]. In particular, Gerzon introduced the optimal panning gain to satisfy the perceptual angle of Gerzon vectors
in the multispeaker stereo setup [5]. To obtain the optimal
gain factors for the listener in the middle of the loudspeakers,
these panning techniques allow negative panning gains
for the loudspeakers far from the direction of the virtual source.
However, negative panning gain occasionally causes an out-of-phase effect, degrading the sound localization. To resolve
this problem, an in-phase panning algorithm that uses only
positive panning gains was proposed in Ambisonics [8].
Therefore, we additionally apply an angle-dependent
gain control function in the vector formulation
of Eq. (4) to obtain nonnegative panning gains:
$$\mathbf{L}'_{012} = \mathbf{W}(\theta)\,\mathbf{L}_{012}, \quad \text{where } \mathbf{W}(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \delta(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{6}$$
In Eq. (4), the vector matrix of loudspeakers, $\mathbf{L}_{012}$, is substituted with $\mathbf{L}'_{012}$ of Eq. (6). This means that the vector of the
middle loudspeaker is angle-dependently regulated by a nonnegative gain control function, δ(θ). This control function
is defined as
$$\delta(\theta) = \left(0.5 - 0.5\cos\left(\frac{2\pi\theta}{N-1}\right)\right)^{\phi}, \quad \text{where } N = 2|\theta_0 - \theta_p|,\ p = 1 \text{ or } 2. \tag{7}$$
The panning angle θ lies in [−|θ0 − θp|, +|θ0 − θp|]. Fig. 3 shows
the angle-dependent nonnegative gain control function for the
multispeaker stereo format of Fig. 1. This
function gradually reduces the loudness of the middle loudspeaker as the panning angle of the virtual source moves farther
from the center. The function also effectively prevents
negative panning gains and is useful for making fine adjustments to the virtual source’s panning angle.
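The control function of Eq. (7) has the form of a Hann window raised to a power. The following Python sketch evaluates it under two assumptions that are not stated in the text: ϕ is read as an exponent, and the window index is offset by (N − 1)/2 so that the peak sits at θ = 0, in line with the description that the center gain falls off away from the center. The function name and the use of the one-sided loudspeaker interval |θ0 − θp| as an argument are hypothetical.

```python
import math


def center_gain_control(theta_deg, half_width_deg, phi):
    """Evaluate delta(theta) of Eq. (7) with an assumed index offset.

    theta_deg      panning angle relative to the center loudspeaker
    half_width_deg |theta_0 - theta_p| for the side being panned
    phi            shaping parameter, read here as an exponent
    """
    if half_width_deg <= 0.5:
        raise ValueError("half_width_deg must exceed 0.5 so that N - 1 > 0")
    if abs(theta_deg) > half_width_deg:
        raise ValueError("theta_deg must lie within the loudspeaker interval")
    n_len = 2.0 * half_width_deg                 # N = 2|theta_0 - theta_p|
    index = theta_deg + 0.5 * (n_len - 1.0)      # assumed offset: peak at theta = 0
    hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * index / (n_len - 1.0))
    # hann is never below zero; the guard only protects the power operation.
    return max(hann, 0.0) ** phi
```

Under these assumptions, `center_gain_control(0.0, 30.0, 0.82)` is close to 1 and the value decays toward 0 as θ approaches ±30°. In an asymmetric layout the function would be called with a different `half_width_deg` and `phi` for each side.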
Figure 5: Subjective test results; bias error (degree) of panning angle. (a) symmetric layout (30°, 0°, −30°), (b) asymmetric layout (40°, 0°, −20°).
The optimal value of ϕ in Eq. (7) can be determined by the configuration of loudspeakers. If the angle interval of the loudspeakers, i.e. |θ1 − θ0| or |θ0 − θ2|, increases, a larger value
is perceptually better for protecting the quality of sound localization from degradation by the out-of-phase effect or the detent
effect. Thus, in the asymmetric layout, i.e. |θ1 − θ0| ≠ |θ2 − θ0|
in Fig. 1, the value of ϕ for the left side of the center loudspeaker is experimentally chosen to be larger than that for the right
side, as in Fig. 3. Fig. 4 shows the proposed panning gains
for the symmetric and asymmetric layouts with the
nonnegative center gain control function applied.
The formulation of the proposed panning algorithm is derived from the vector form of multiple loudspeakers, and it
is termed multiple-wise vector base nonnegative amplitude
panning (MVBNAP).
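The gain computation of Eqs. (3) to (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the three loudspeakers are ordered left, center, right so that the middle diagonal entry of W(θ) scales the center loudspeaker, the value of δ(θ) is passed in as the plain number `center_weight`, and the function name is hypothetical. The pseudoinverse is expanded by hand as X⁺ = (XᵀX)⁻¹Xᵀ, which needs only a 2×2 inverse.

```python
import math


def _unit(deg):
    r = math.radians(deg)
    return math.cos(r), math.sin(r)


def mvb_gains(pan_deg, speakers_deg, center_weight=1.0):
    """Energy-normalized gains (left, center, right) from g = p^T L'^+.

    speakers_deg  (left, center, right) azimuths in degrees (assumed order)
    center_weight value of delta(theta) applied to the center row (Eq. (6))
    """
    weights = (1.0, center_weight, 1.0)          # diagonal of W(theta)
    rows = []
    for w, angle in zip(weights, speakers_deg):
        x, y = _unit(angle)
        rows.append((w * x, w * y))              # rows of L' = W L
    # Gram matrix A = L'^T L' (2x2, symmetric).
    a11 = sum(x * x for x, _ in rows)
    a12 = sum(x * y for x, y in rows)
    a22 = sum(y * y for _, y in rows)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker vectors do not span the plane")
    px, py = _unit(pan_deg)
    # q = p^T A^{-1}, then g = q L'^T gives one gain per loudspeaker (Eq. (4)).
    qx = (px * a22 - py * a12) / det
    qy = (py * a11 - px * a12) / det
    gains = [qx * x + qy * y for x, y in rows]
    norm = math.sqrt(sum(g * g for g in gains))  # energy normalization, Eq. (5)
    return [g / norm for g in gains]
```

For the symmetric layout (30°, 0°, −30°) with `center_weight=1.0`, panning to 30° gives a negative gain for the far loudspeaker, which is the behavior the control function is meant to suppress; with `center_weight=0.0` the same call reduces to a single active loudspeaker.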
4. PERFORMANCE EVALUATIONS AND
DISCUSSIONS
In the subjective listening tests, sound localization performance was evaluated according to the method of adjustment
(MOA) [7, 13]. The test signal consisted of 2 s of pink noise.
The reference signal was emitted in the desired panning
direction by monophonic loudspeakers placed at angular intervals of 5 degrees (°) between the left loudspeaker at θ1
and the right loudspeaker at θ2. The loudspeakers for the
multispeaker stereo were positioned as in Fig. 1, and the distance from the listener to each loudspeaker was 1.5
m. Five experienced audio engineers participated in the
tests. They were asked to adjust the panning angle with 1
degree resolution to match the direction of the reference by operating the control buttons in PC software, as shown in
Fig. 6.
The three conventional panning methods, i.e. pair-wise
VBAP and MDAP with 20° or 30° spread angle, and the
proposed panning method, MVBNAP, were compared [4, 6].
The results of the subjective listening tests are shown in Fig.
5. The bias error is the difference between the perceived panning angle and the desired panning angle. Fig.
5-(a) shows the bias error in the symmetric layout (30°, 0°, −30°), and Fig. 5-(b) shows the bias
error in the asymmetric layout (40°, 0°, −20°). To generate
the panning angle in all directions from θ1 to θ2, additional loudspeakers were positioned at 90° and −90° only
for MDAP, as in [6].
In the symmetric layout, the bias error of VBAP increases in directions closer to a
loudspeaker. This bias is caused by the detent effect and is
a common problem in pair-wise amplitude panning [5].
In the exact direction of a loudspeaker, however, the sound localization of VBAP is best because the signal is emitted by only
one loudspeaker. MDAP results in better performance in
directions closer to the center-positioned loudspeaker. However, in directions far from the center loudspeaker, the
localization performance decreases markedly. This is understood to be because the virtual vector for the lateral direction has worse localization performance than
the virtual vector in the relatively frontal direction. In addition,
the position of the additional loudspeakers at ±90° can influence
sound localization. The bias error of MVBNAP is the smallest in the frontal direction between about 10° and −10°. In
the direction of the side loudspeakers, MVBNAP also shows
a detent effect similar to that of pair-wise VBAP.
In this test of the symmetric layout, we experimentally chose the
optimal parameter of the nonnegative gain control function as
ϕ = 0.82, as in Fig. 3. If the curve of Eq. (7) is designed to
obtain better sound localization, its results can be improved.
In the asymmetric layout, pair-wise VBAP shows an
effect similar to that in the symmetric layout. MDAP also
shows better localization in directions closer to the center-positioned loudspeaker. However, in the side directions,
the sound localization inclines toward the center direction.
The reason is that the localization performance of the virtual source of VBAP is angle-dependent. This can be verified by the test result that MDAP30 performs worse than
MDAP20 in the lateral directions. Although MVBNAP overall results in more robust localization than the other
panning methods, the bias error slightly increases, to within −2
degrees, in directions near the frontal loudspeaker. Its
bias to the right is caused by the proximity of the
center and right loudspeakers when the multiple loudspeakers concurrently emit the signal, as in MDAP. In the
asymmetric layout, the parameters (ϕ) of the nonnegative gain
control function were set to 0.48 and 0.97, as in Fig. 3. To
reduce the bias error with close loudspeakers, the curve of
Eq. (7) should be sharper. If a higher-order sinusoidal
function is applied in the formulation of Eq. (7), its sound
localization can be improved.
Figure 6: User interface of the test program using the method
of adjustment (MOA).
5. CONCLUSIONS
In the loudspeaker system, the point-like perception of the
virtual sound source is the best result. However, the performance of panning algorithms is easily influenced by the
configuration of loudspeakers and the listener’s position. Additionally, the negative gain factor can affect the stability of
sound localization. In this paper, we introduced a nonnegative amplitude panning algorithm using a multiple-wise vector base formulation to compute panning gains for the multiple loudspeakers. It has the advantages of both the vector base
formulation and multiple-loudspeaker panning techniques.
The results of subjective listening
tests indicate that the localization performance of MVBNAP
is more robust in almost every direction. However, the proposed panning algorithm can be further improved through the
optimization of the nonnegative gain control function for any configuration of loudspeakers. A clear
theoretical basis for choosing the control function should be verified
against psychoacoustic theory and subjective test results in
the future.
REFERENCES
[1] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: MIT Press, 1997.
[2] A. D. Blumlein, U.K. patent 394,325, 1931.
[3] B. Bernfeld, "Attempts for better understanding of the directional stereophonic listening mechanism," AES 44th Conv., Rotterdam, Mar. 1973.
[4] V. Pulkki, "Virtual sound source positioning using vector base amplitude panning," J. Audio Eng. Soc., vol. 45, no. 6, pp. 456-466, Jun. 1997.
[5] M. A. Gerzon, "Panpot laws for multispeaker stereo," AES 92nd Conv., Vienna, Mar. 1992.
[6] V. Pulkki, "Uniform spreading of amplitude panned virtual sources," in Proc. IEEE Workshop on Application of Signal Processing to Audio and Acoustics (WASPAA), New York, USA, Oct. 1999.
[7] V. Pulkki and T. Hirvonen, "Localization of virtual sources in multichannel audio reproduction," IEEE Trans. on Speech and Audio Processing, vol. 13, no. 1, pp. 105-119, Jan. 2005.
[8] G. Monro, "In-phase corrections for ambisonics," in Proc. Int. Computer Music Conf. (ICMC), Berlin, Germany, 2000.
[9] P. G. Craven, "Continuous surround panning for 5-speaker reproduction," AES 24th Int. Conf. on Multichannel Audio, Jun. 2003.
[10] V. Pulkki, "Spatial sound generation and perception by amplitude panning technique," PhD thesis, Helsinki University of Technology, Espoo, Finland, 2001.
[11] M. Poletti, "Robust two-dimensional surround sound reproduction for nonuniform loudspeaker layouts," J. Audio Eng. Soc., vol. 55, no. 7/8, pp. 598-610, July/August 2007.
[12] S. Haykin, Adaptive Filter Theory, 4th ed. Upper Saddle River, NJ: Prentice-Hall, 2002.
[13] B. L. Cardozo, "Adjusting the method of adjustment: SD vs DL," J. Acoust. Soc. Am., vol. 37, no. 5, pp. 768-792, May 1965.