Distributed, self-scaling, network

US 20140064519A1
(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2014/0064519 A1
Silfvast et al.
(43) Pub. Date: Mar. 6, 2014
(76) Inventors: Robert D. Silfvast, Belmont, CA (US); Raymond Tantzen, Pacifica, CA (US)
(21) Appl. No.: 13/602,433
(22) Filed: Sep. 4, 2012
Publication Classification
Int. Cl. H04B 1/00
(52) U.S. Cl. USPC 381/119

A distributed self-scaling network audio processing system includes end nodes interconnected by packet-switched network and operating as peers on the network. Each of the end nodes supports local input processing, mixing, and output processing. The input processing includes the option of dual input channels for supporting separate front-of-house and monitor workflows. End nodes are added to the system to support specific audio processing applications, based on the number of audio sources, the number of output mixes required, and the number of locations from which users choose to interact with the system.
[FIG. 1 (Prior Art): Users A, B, and C, each at an I/O device connected to a central engine (DSP + mixing).]
[FIG. 2: Users A, B, and C, each at an end node (DSP + mixing).]
[0001] In professional audio, a mixing console, or audio mixer, also called a sound board, mixing desk, or mixer, is an electronic device for combining (also called mixing), routing, and changing the level, timbre and/or dynamics of audio signals. A mixer can mix analog or digital signals or both, depending on the type of mixer. The modified signals (voltages or digital samples) are summed to produce the combined output signals.
[0002] Mixing consoles are used in many applications, including recording studios, public address systems, sound reinforcement systems, broadcasting, television, and film post-production. An example of a simple application would be to enable the signals that originated from two separate microphones (each being used by vocalists singing a duet, perhaps) to be heard through one set of speakers simultaneously. When used for live performances, the signal produced by the mixer will usually be sent directly to an amplifier, unless that particular mixer is “powered” or it is being connected to powered speakers.
[0003] The output of a mixer is referred to as a mix bus or simply a bus. As used herein, the term “mix bus” refers to an audio signal produced by combining multiple audio source signals in a weighted summation operation where typically the individual weights applied to each source signal are under user control (for example using the linear faders or knobs of a mixing console). The term “mix matrix” is used to refer to an operation that produces multiple mix busses from a common group of audio source signals. At any instant in time, this operation can be mathematically represented by the matrix equation C = B*A, where C is a vector of N output signal states, A is a vector of M input signal states, and B is an M-by-N rectangular matrix of summing weights, and the * is a matrix multiplication operator. In some cases a mix bus might be a single discrete audio channel, while in other cases it may include more than one audio channel having a common association (for example, a stereo mix bus has two channels, left and right, and a surround mix bus has more than two channels corresponding to the surround speaker configuration targeted by the particular mix).
[0004] In common practice, the mixing console serves as a central “hub” in the audio system, allowing for all of the audio source signals in a given application to be acquired, treated, combined into various mixes, and then re-distributed outward to monitoring equipment (loudspeakers and headphones) or recording equipment (tape decks or hard disk recorders) or broadcast feeds (satellite uplinks, webcasts, other remote feeds) from a central point in the system. The use of this centralized architecture has been necessary in designing analog mixing consoles, because these devices employ analog circuitry that is physically attached to the various control knobs, switches, faders (rheostats), and LED indicators. In order for a single person to operate all the controls of the system in an ergonomically convenient manner, all of these analog circuits needed to be located underneath, or behind, a common physical control panel. With the penetration of digital technology into mixing console design, some equipment makers have chosen to physically separate the user interface controls from the audio processing hardware elements; however the audio signal flow architecture has remained essentially the same, with the mixer being the center of the audio system.
[0005] Audio systems are not always built around one single mixer. In fact, it is common practice to use multiple mixers in a given application to perform sub-mixing. In this model, the mixing (combining) of audio signals occurs in a hierarchical fashion, with groups of signals being pre-mixed in one mixer, and the result of that pre-mix being fed into another mixer where it is combined with other individual signals or other pre-mixes coming from other sub-mixers. In a live concert application, it is common practice to separate the “front-of-house” mixing task from the “on-stage monitoring” mixing task using two separate mixing consoles each having its own operator. In this model, each source signal is split into two feeds (often using a device called a “splitter snake” which performs this function for many sources); one feeding each of the front-of-house and on-stage monitoring mixers. The front-of-house operator creates the audience mix, while the monitor mix operator creates mixes for the performers on stage to hear themselves and their co-performers as clearly as possible.
[0006] Despite its continued prevalence over many years, the conventional, centralized mixer approach has some distinct and important disadvantages. A first problem with conventional audio mixing systems is that they do not scale in a natural and easy way. Most users of mixing consoles service a wide range of audio production applications and scenarios, requiring anywhere from one or two channels and a simple mono mix, up to dozens or even hundreds of channels and dozens of separate mixes. Therefore, when purchasing a mixer, it is difficult to determine exactly which size console to buy. Mixing console vendors offer a very wide range of sizes to cover the market space, and buyers must choose something that seems like the right fit, hoping to avoid spending more money or taking up more space than they need to or, on the other hand, hoping to avoid running out of channels or mix busses when they have a large job. Some buyers/users will purchase multiple, different sized mixers to handle different
[0007] A second problem to be solved occurs in networked audio mixing systems, i.e., those that use shared, packet-based networks to interconnect signal input and output (I/O) devices with signal processing devices. These systems typically impose considerable latency in the audio path from signal source to monitor output. This latency, typically on the order of 2 to 10 milliseconds, can negatively impact the experience of, and results achieved by, a performer who is singing or playing an instrument while monitoring himself through the system. The reasons for this increased latency are twofold: first, packet-switched networks have queues and delays within their basic infrastructure, such that signal transport across the network takes an indeterminate amount of time; this mandates a minimum “safety bound,” typically on the order of 1 or 2 milliseconds for optimized networks such as those using IEEE Audio Video Bridging standards (and higher amounts for networks using older technologies), that the receiving side must expect in order to avoid “buffer underrun” conditions that cause audio glitches. Second, conventional systems locate the I/O and signal processing/mixing functions in separate physical units; thus for a singer to hear herself in a monitor mix, her signal must make two trips across the network (from the I/O to the mixer and back to the I/O again). The network transport latency compounds with analog-to-digital and digital-to-analog conversion latency to impose a minimum latency typically of 2 milliseconds, and often much more, along the most critical-latency path.
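The latency build-up described in paragraph [0007] can be illustrated with a small sketch. It simply adds the conversion latencies to the per-trip network safety bound for a self-monitoring path through a central engine. The function name and the converter latencies (0.25 ms each) are illustrative assumptions, not values from this application; only the 1 ms per-trip bound and the two network trips come from the text.

```python
def monitor_path_latency_ms(network_trips: int,
                            per_trip_ms: float,
                            adc_ms: float,
                            dac_ms: float) -> float:
    """Sum the latency contributions along one monitoring path.

    network_trips: how many times the signal crosses the network
    per_trip_ms:   receive-side safety bound per network trip
    adc_ms/dac_ms: converter latencies (assumed values, see below)
    """
    return adc_ms + network_trips * per_trip_ms + dac_ms


# Centralized system: I/O device -> central mixer -> I/O device (two trips).
# The 0.25 ms converter figures are hypothetical placeholders.
centralized_self_monitor_ms = monitor_path_latency_ms(
    network_trips=2, per_trip_ms=1.0, adc_ms=0.25, dac_ms=0.25
)
print(centralized_self_monitor_ms)
```

With these assumed figures the path already exceeds 2 milliseconds before any processing delay is counted, which is consistent with the minimum stated above.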
[0008] The importance of minimizing latency for a self-monitoring path can be quantified as follows: Each millisecond of latency imposed on an audio signal corresponds to sound traveling through air a distance of 0.34 meters (about 13 inches) at sea level. When a person sings, she hears her vocal cords within a fraction of a millisecond as the vibrations are conducted through bone, body tissue, and immediate surrounding air to her ears. When a person plays an acoustic guitar, he hears the sound from the guitar within about 2 milliseconds, since he is holding the instrument no further than about 2 feet from his head. When a group of people perform together (or even when they have a conversation in the same room), they are typically located a few feet apart, thus they hear each other a few milliseconds later than each person hears his or her own voice or instrument. We therefore conclude that self-monitoring becomes unnatural when the signal path from voice or instrument to ears has a latency greater than about 2 milliseconds. However, monitoring others can seem perfectly natural when the signal path latency is 5 or 10 milliseconds or even more.
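The arithmetic behind these figures can be checked with a short sketch. It assumes a speed of sound of 340 m/s, which is the value implied by the 0.34 meters per millisecond stated above; the function names and the default threshold parameter are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # assumed; matches 0.34 m per millisecond
METERS_PER_FOOT = 0.3048


def latency_to_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air during the given latency."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


def distance_to_latency_ms(distance_m: float) -> float:
    """Acoustic delay for sound to cover the given distance."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def self_monitoring_feels_natural(latency_ms: float,
                                  threshold_ms: float = 2.0) -> bool:
    """Apply the roughly 2 ms self-monitoring limit from the text."""
    return latency_ms <= threshold_ms


# A guitar held about 2 feet from the player's head.
guitar_delay_ms = distance_to_latency_ms(2 * METERS_PER_FOOT)
print(round(guitar_delay_ms, 2), self_monitoring_feels_natural(guitar_delay_ms))
```

The guitar case comes out slightly under 2 milliseconds, in line with the "within about 2 milliseconds" estimate in the paragraph.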
[0009] A third problem with conventional audio mixing systems, as well as modern network-based mixing systems, is that their use of a centralized mix engine creates an inconvenient topology that hinders the ergonomics and increases cost of system setup and maintenance. The central mix engine needs to be set up, powered, and connected with (typically) large numbers of cables to the various devices at the extremities of the system which are located near actual users. This results in a large number of cables crossing through the stage or room, and a large number of potential failure points in the system.
[0010] A fourth problem stems from conventional systems’ lack of fault tolerance since they rely on a central mix engine for all the audio processing. If a fault occurs in the central mixer (such as a power supply failure or a main CPU crash) then it is possible for the entire system to become inoperable.
[0011] In general, the methods, systems, and computer program products described herein provide distributed audio processing. The architecture is based on audio processing nodes connected with a network and operating as “peer devices” in a system. Advantages of the system include the ability of the system to scale linearly with the number of input channels and output mixes required, reduced audio latency, and improved end-user ergonomics. In general, in one aspect, an audio processing unit comprises: an audio input module for receiving one or more source audio signals; an audio output module for outputting one or more audio mixes; a network connection module configured to send and receive audio signals over a network in substantially real-time; one or more input channels for processing the received one or more source audio signals, wherein each of the received source audio signals is processed by an assigned channel of the one or more input channels, and wherein each input channel includes a channel strip comprising a chain of processing blocks to be applied to the received source audio signal assigned to that channel, and wherein an output of the channel strip is provided to the network connection module for transmission over the network; a digital mixer for generating one or more output mixes by mixing the processed source audio signals received from the one or more channel strips with audio signals received via the network connection module from outputs of one or more real-time audio devices connected to the network; and one or more output channels for processing the one or more output mixes, wherein each of the one or more output mixes is processed by an assigned one of the one or more output channels, and wherein the audio output module is configured to receive and output the processed one or more output mixes.
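The signal flow of this aspect (a channel strip as a chain of processing blocks, followed by a local mix with signals arriving from the network) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: signals are plain lists of samples, and `gain` and `delay` are hypothetical stand-ins for whatever processing blocks a real channel strip would contain.

```python
from typing import Callable, List, Sequence

Signal = List[float]
Block = Callable[[Signal], Signal]


def gain(factor: float) -> Block:
    """Hypothetical processing block: scale every sample."""
    return lambda signal: [factor * s for s in signal]


def delay(samples: int) -> Block:
    """Hypothetical processing block: delay by whole samples, same length."""
    def apply(signal: Signal) -> Signal:
        return ([0.0] * samples + list(signal))[:len(signal)]
    return apply


def channel_strip(blocks: Sequence[Block]) -> Block:
    """Chain processing blocks; each block feeds the next one."""
    def apply(signal: Signal) -> Signal:
        for block in blocks:
            signal = block(signal)
        return signal
    return apply


def mix(signals: Sequence[Signal], weights: Sequence[float]) -> Signal:
    """Weighted sum of equally long signals, sample by sample."""
    length = len(signals[0])
    return [sum(w * s[i] for w, s in zip(weights, signals))
            for i in range(length)]


local_source = [1.0, 0.5, -0.5]
strip = channel_strip([gain(0.5), delay(1)])
strip_output = strip(local_source)      # would also be sent over the network
remote_signal = [0.2, 0.2, 0.2]         # assumed to arrive from another node
output_mix = mix([strip_output, remote_signal], weights=[1.0, 0.5])
print(strip_output, output_mix)
```

The same strip output feeds both the local mixer and the network, which mirrors the claim language that the channel strip output is provided to the network connection module as well as to the digital mixer.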
[0012] Various embodiments include one or more of the following features. The audio processing unit further includes a processor for hosting a user interface, and the user interface enables an operator to control parameters of the one or more output mixes. The network connection module includes a network switch including a port connected to the processor, and at least two externally available ports for establishing connections to a plurality of devices on the network, and wherein the network switch is configured to filter and route packets between the network switch ports enabling the network switch to bridge between at least two externally connected network devices and the processor of the audio processing unit. The at least two externally available ports support a daisy chain connection topology. The network connection is configured to receive over the network control commands for controlling parameters of at least one of the one or more input channels, the digital mixer, and the one or more output channels. The control commands are transmitted over the network by a device connected to the network, and the control commands are generated by interaction of an operator of the device with a user interface of the device. The one or more processed output mixes are provided to the network connection module for transmission over the network, and the operator of the device is able to listen to the one or more processed output mixes while controlling the parameters of at least one of the input processor, digital mixer and the output processor. A user interface for controlling the audio processing unit is hosted by a second audio processing unit connected to the network. Each of the input channels further comprises a second channel strip for processing the one or more received source audio signals, wherein the output of each of the second channel strips is provided to the digital mixer and to the network connection module for transmission over the network. The outputs of the first-mentioned channel strips are suitable for feeding a local monitor mix and the outputs of the second set of channel strips are suitable for feeding a front of house mix. An output of the digital mixer is provided to the network connection module for transmission over the network. An output of the output mix processor is provided to the network connection module for transmission over the network. The channel strip processing of the received audio signals includes one or more of a rumble filter, equalization, delay, and insert processing. The network connection module is configured to receive pre-mixed audio signals over the network, and the digital mixer is able to generate an output mix that includes the pre-mixed audio signals. The digital mixer is configured to generate one or more output mixes in addition to the first-mentioned output mix, and the audio processing unit further comprises one or more output processors in addition to the first-mentioned output processor, wherein each of the first mentioned output mix and the one or more additional output mixes is processed by an assigned one of the first mentioned output processor and additional one or more output processors to generate a processed output mix for sending to the audio output module. The audio processing unit further includes an analog mixer for receiving one or more of the source audio signals in analog form and for mixing the one or more received audio signals in analog form with one or more submixes of signals received from the network via the network connection module, wherein an output of the analog mixer is received for output by the audio output module, such that an audio path latency for the one or more signals received in analog form between receipt by the audio input module and output by the audio output module is less than about 50 microseconds.
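Generating several output mixes from one shared set of sources is the “mix matrix” operation C = B*A defined in paragraph [0003]. The sketch below evaluates that product for a single instant in time using plain Python lists. One assumption should be noted: the weights are stored with one row per output mix and one column per source, which is the layout required when A and C are treated as column vectors; paragraph [0003] words the shape of B as M-by-N, so this orientation is a choice made for the example. The function name and sample values are illustrative.

```python
from typing import List, Sequence


def mix_matrix(weights: Sequence[Sequence[float]],
               inputs: Sequence[float]) -> List[float]:
    """Compute C = B * A for one instant in time.

    weights: B, one row per output mix, one column per source (assumed layout)
    inputs:  A, the M input signal states
    returns: C, the N output signal states
    """
    for row in weights:
        if len(row) != len(inputs):
            raise ValueError("each weight row needs one entry per source")
    return [sum(w * a for w, a in zip(row, inputs)) for row in weights]


# Three sources feeding two independent mixes (N = 2, M = 3).
source_states = [0.5, -0.25, 1.0]
summing_weights = [
    [1.0, 0.0, 0.5],   # mix 1: source 1 at unity, source 3 at half level
    [0.0, 1.0, 0.5],   # mix 2: source 2 at unity, source 3 at half level
]
mix_states = mix_matrix(summing_weights, source_states)
print(mix_states)
```

Adding a further output mix corresponds to adding a row of weights, and adding a further source corresponds to adding a column.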
[0013] In general, in another aspect, an audio processing system comprises: a plurality of end nodes connected by a network, wherein each of the end nodes is configured to send and receive audio signals over the network in substantially real-time, each end node including: one or more audio input ports; one or more audio output ports; an input processing module; a mixing module; and an output processing module for processing a mix received from the mixing module; wherein a first end node of the plurality of end nodes is configured to: receive first audio signals via the one or more audio input ports of the first end node; condition the first audio signals using the input processing module of the first end node; and transmit the conditioned first audio signals over the network; and wherein a second end node of the plurality of end nodes is configured to: receive the conditioned first audio signals via the network; receive additional conditioned audio signals from one or more end nodes of the plurality of end nodes other than the first and second end nodes; mix the conditioned first audio signals and the additional conditioned signals using the mixing module of the second end node to generate an output mix; process the output mix using the output processing module of the second end node; and output the one or more rendered output mixes from the one or more audio output ports of the second module.
[0014] Various embodiments include one or more of the following features. Configuring the first and second end nodes to send and receive audio signals over the network in substantially real-time corresponds to a signal transport latency in the network that is approximately equal to an acoustic path latency between a physical location of the first node and a physical location of the second node. The second end node is further configured to: receive second audio signals via the one or more audio ports of the second end node; condition the second audio signals using the input processing module of the second end node; and include the conditioned second audio signals as one or more inputs to the mixing module to generate an output mix that includes the conditioned second audio signals. One or more of the plurality of end nodes each includes a second input processing module, and wherein, for each of the one or more of the plurality of end nodes that include a second input processing module: the first mentioned input processing module is configured to condition the audio signals received from the one or more audio input ports of that end node for a front of house mix; and the second input processing module is configured to condition the audio signals received from the one or more audio input ports of that end node for a monitor mix local to that end node. Conditioning the first audio signals includes at least one of rumble filtering, equalization, delaying, and insert processing. Rendering the output mix includes at least one of adding reverb effects and equalization, and the rendering is adapted to an output environment associated with the second end node. The audio processing system further comprises one or more additional end nodes in addition to the first-mentioned plurality of end nodes, wherein each of the one or more additional end nodes is connected to the network, and the one or more additional end nodes includes at least one of a video camera, a digital audio workstation, a mixer control panel, a mobile controller, a video display, and a media server.
[0015] In general, in a further aspect, an audio processing system comprises: a plurality of end nodes connected by a network, wherein each of the end nodes is configured to send and receive audio signals over the network in substantially real-time, each end node including: one or more audio input ports; one or more audio output ports; an audio processing module for processing audio signals received via the one or more audio input ports; and a mixing module for mixing audio signals; and the system is configured to: at a first end node of the plurality of end nodes: receive a command via a user interface local to the first end node, wherein the command is one of an audio processing command and a mixing command; and transmit the command across the network; and at a second end node of the plurality of end nodes: receive the command; and execute the command on the second end node.
[0016] FIG. 1 is a high level block diagram of prior art centralized mixing systems.
[0017] FIG. 2 is a high level block diagram of a distributed self-scaling network audio processing system.
[0018] FIG. 3 is a block diagram of a distributed network audio processing system with six audio sources and seven mix outputs spread across four end nodes.
[0019] FIG. 4 is a high level block diagram of the input processing, mixing, and output processing functions of an end node of a distributed audio processing system.
[0020] FIG. 5 is a high level block diagram of an end node of a distributed audio processing system illustrating submix
[0021] FIG. 6 is a high level block diagram of an end node that includes a CPU for hosting a local UI.
[0022] FIG. 7 illustrates a range of end node types that may be linked to the network as part of a distributed self-scaling network audio processing system.
[0023] FIG. 8 is a diagrammatic screen shot of a home screen of an illustrative user interface for controlling a distributed audio processing system.
[0024] FIG. 9 is a diagrammatic screen shot of a source control screen of an illustrative user interface for controlling a distributed audio processing system.
[0025] FIG. 10 is a diagrammatic screen shot of a mix control screen of an illustrative user interface for controlling a distributed audio processing system.
[0026] The methods and systems described herein enable a real-time audio processing system that uses a distributed architecture that improves scalability, critical-path audio latency, and end-user ergonomics. The new system architecture is referred to as “distributed, self-scaling, network” (or DSSN) mixer architecture.
[0027] The availability of modern networking technology such as gigabit Ethernet and IEEE Audio Video Bridging standards, and the ability of this technology to carry large numbers of real-time audio signals between different pieces of digital audio equipment with very low latency, has allowed for a new approach to designing an audio mixing system. The
new approach creates a truly distributed and peer-to-peer architecture, rather than centralized system architecture, for acquiring, treating, combining into various mixes, and then re-distributing the multiple audio signals in a sound mixing application. The distributed nature of this new architecture enables compelling solutions to common problems found in conventional mixing console systems, including the scalability, latency, ergonomics, and reliability problems described above.
[0028] The approach described herein features a self-scaling architecture, based on the way devices aggregate to grow the mixer size. Aggregation involves combining two or more physically separate mixers to achieve a larger overall mixer that is treated as one. Conventional mixers aggregate to expand input channels, but typically the mix busses simply cascade from one unit to the next, not increasing in number. Specifically, the mix bus output signals from one mixer are simply summed with unity gain into the mix busses of the next mixer downstream in the signal flow chain. While expansion is achieved in the number of sources that can be mixed, no additional mixes (i.e., separate, final output signals) are created. We refer to this as “one-dimensional scaling.” With the DSSN architecture, both input channels and mix busses scale in number as more physical units are added to the system. We refer to this as “two-dimensional scaling.” The benefit of two-dimensional scaling becomes especially compelling in situations where each performer desires his own custom monitor mix, a practice that is commonplace in high-end applications such as professional concerts, and is becoming more widely expected in lower-end applications such as rehearsals, small-scale concerts, churches, and corporate audio/video applications. DSSN architecture treats each performer (or localized group of performers, such as a horn section or a trio of background singers), as an endpoint of the distributed system, with each endpoint having both source signals and the need for unique output mixes. DSSN architecture also treats the audience itself (or even multiple audiences in separate locations, or multiple zones in a large venue) as separate endpoints in the system requiring unique output mixes, and in some cases having source signals to contribute into the system (such as audience microphones to pick up audience sounds and room ambience).
[0029] A key to enabling two-dimensional scaling is the network’s ability to deliver a large number of source channels to each and every end node (separate physical device) so that all end nodes can create independent mixes across their local set of busses. A DSSN-based mixing system achieves this by having a number of end nodes connected by a network, with each of the end nodes having its own I/O, input channel processing, mixing, and output channel processing capability.
[0030] With DSSN architecture, end nodes are simply added to the system according to I/O requirements for a given application. The facilities for input processing, mixing, and output processing are built into each of the end nodes, such that all I/O points have the processing they need to service the performers or operators interacting with the various end nodes. Unlike working with predetermined mixer size options, a user of a DSSN-based system can incrementally add processing channels or mix busses to match the amount of audio I/O facilities required for the application at hand. This is easily accomplished by plugging additional end nodes into the network.
[0031] Another significant advantage of DSSN architecture is its fundamental and significant reduction of critical-path audio latency within a network-based mixing system. DSSN architecture achieves this by omitting trips across the network for critical path signals. For self-monitoring, which is by far the most critical path requiring low latency, the signal path never traverses the network. Thus, instead of two network delays which occur in conventional network mixing systems, DSSN architecture-based systems provide fully featured, self-monitoring audio paths having zero network delays. The architecture exploits the theory that the talent’s vocal cords or instrument are generally located near her ears; thus, the source signal can be mixed locally with other source signals contributed both locally and from other end nodes on the network, and the resulting mix outputted directly to the talent by the same local unit. Furthermore, the signal path for monitoring source signals inputted at other end nodes on the network incurs just one trip across the network (compared to two trips for a conventional system with centralized mix engine).
[0032] The DSSN network described herein enables end nodes connected to the network to exchange audio signals in substantially real time. As used herein, substantially real-time means that the delay between the input of an audio signal and its output is no more than the amount of time for sound to travel across a moderate sized room or space. An upper bound to such a delay is in the range of 30 to 50 milliseconds, which contrasts with the typical latency of audio signals traveling across wide-area or cellular networks, which often exceeds 100 milliseconds.
[0033] FIGS. 1 and 2 illustrate fundamental differences in the architecture between centralized prior systems and DSSN architecture-based systems, and show how the architecture reduces latency. FIG. 1 illustrates audio mixing system 100 having a centralized architecture, with central engine 102 performing the signal processing and mixing connected in point-to-point fashion to input/output devices 104, 106, 108 located next to Users A, B, and C respectively. In such a system, for User A to monitor himself requires two network trips: the first to transmit the audio to central engine 102, and the second for the central engine to transmit the monitor mix back to I/O device 104 collocated with User A. For User A to monitor another user, e.g., User B, two trips are also required: the first for User B’s I/O device 106 to transmit to the central engine, and the second for the central engine to transmit onward to User A. Thus all monitoring results in a minimum monitoring delay of two network delays plus the delays associated with conversion of the signal from analog to digital at the input, and digital to analog at the output.
[0034] By contrast, in DSSN-based system 200 shown in FIG. 2, each of the end nodes (202, 204, 206) includes local signal processing and mixing capability. The end nodes are connected by network 208, such as a Gigabit Ethernet network. When a user monitors himself, there is no network usage because the entire path from his source signal input to his monitor signal output is contained within end node 202. Furthermore, because this audio path excludes the network (a digital medium), the monitoring could be done entirely in the analog domain, thus also avoiding the latency associated with analog to digital conversion and digital to analog conversion, resulting in zero latency monitoring. If some level of delay is desired for time aligning source signals with signal paths used to monitor other end nodes, an artificial delay can be inserted in the local end node’s DSP path. This contrasts with a network-based mixer using a central engine, in which the delays
Mar. 6, 2014
US 2014/0064519 A1
imposed by the network cannot be eliminated. In DSSN
mixer architecture, monitoring of others on the netWork is
also improved signi?cantly compared to a conventional, cen
traliZed system. These paths incur just a single trip across the
netWork instead of tWo. For example, When User A monitors
User B, there is a single netWork trip of the audio from end
chooses (not shoWn). It Would also be common for a front
of-house mix operator to have a separate “cue mix” output
(not shoWn), typically feeding headphones, that he can use to
audition different mixes, hear the talkback communications
among users, or monitor other signals that are not fed to
node 204 to end node 202.
audience loudspeakers. Aside from the nature of their respec
tive users, each of end nodes 302, 304, 306, and 308 include
fundamentally the same basic capabilities. Thus, a common
[0035] FIG. 3 provides a high-level diagram of illustrative
DSSN architecture-based system 300 with six audio sources
and seven mix outputs spread across four end nodes. End
nodes 302, 304, 306, and 308 are each connected to network
310 via wired or wireless links 312, 314, 316, and 318, respectively.
DSSN mixer end nodes and any end nodes connected
to the network communicate directly with each other over
network 310, obviating the need for a central, intermediary or
end node architecture is capable of supporting both on-stage
performers and the “house mix” in a live concert application.
Also illustrated is separate mobile controller 346, connected
to the network via wireless link 348, enabling control commands
to be entered using a mobile device that does not need
to include audio processing capability.
master device to manage and/or direct communications
among the different devices.
[0036] As illustrated in FIG. 3, an end node serving a performer,
such as end node 302 serving User A, includes an
audio capture device, such as microphone 320 connected via
pathway 322 via an audio input module (not shown) to end
node 302. The end node performs input processing at input
processing module 324, and passes the processed signal (A')
to local mixer 326 and network interface 328. At local mixer
326, the processed signal from User A may be mixed with
input-processed audio signal inputs from other end nodes (B',
C', D', E', F') received via network interface 328. A local
monitor mix is output from local mixer 326 and undergoes output
processing at output processing module 330 before being sent
via an audio output module (not shown) via pathway 332 to an
audio output device, such as headphones 334. End node 302
may include monitor control panel 336 for providing a user
interface to User A, from which control commands are passed
to one or more of the processing modules in end node 302 or
transmitted over network 310 via network interface 328 to
control functions within other end nodes as needed, depending
on the application at hand. Monitor control panel 336 may
be external to the main chassis of end node 302 (as illustrated
in FIG. 3), or embedded within the chassis, depending on the
form factor and usage model of the particular end node. In
some cases this user interface may be implemented on a
single touch screen, while in other cases it may be implemented
with dedicated knobs, switches, LEDs, character displays,
and the like.
[0039] FIG. 3 illustrates the self-scaling feature of DSSN
architecture in that both the number of sources and the number
of mix outputs supported by the overall system scale up or
down as end nodes are added or removed. For example, if
User A and User B want to rehearse together, then the end
nodes serving User Group C and the Front of House Mix
Operator may be omitted, and the system scales down to two
inputs and two mix outputs with no loss of useful functionality
to User A or User B. In another example, loudspeakers
338 and 340 may be implemented as separate DSSN mixer
end nodes, each producing a single mix output to drive its
local amplifier and transducer elements, which would obviate
the need for end node 308. In this case the large mixer control
panel 342 could communicate with designated end nodes,
such as those embedded in loudspeakers, over the network
using optional network link 350. In this scenario, mixer control
panel 342 and mobile controller 346 are essentially
equivalent from a system point of view, differing only in their
form factor and user interface. For an application requiring a
large number of loudspeaker units, for example to cover a
large concert venue, a configuration having a DSSN mixer
end node inside each loudspeaker allows each loudspeaker to
generate its own unique mix based on its location, acoustical
environment, and proximity to certain listeners. The self-scaling
property ensures that the system does not have too
many or too few mix busses as loudspeakers are added or
removed from the system.
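As a rough illustration of the self-scaling behavior described in paragraph [0039], the following Python sketch totals the sources and mix outputs contributed by whichever end nodes are present. The `EndNode` and `system_capacity` names are hypothetical, and the per-node counts are assumptions chosen only so that the totals match the six sources and seven mixes stated for FIG. 3; the text does not give the split for end nodes 306 and 308.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class EndNode:
    """One end node and the resources it brings to the system."""
    name: str
    sources: int      # audio sources fed into this node
    mix_outputs: int  # independent mixes this node produces locally


# Assumed per-node split (only the totals, 6 sources and 7 mixes, come from the text).
FIG3_NODES: Dict[str, EndNode] = {
    "302": EndNode("User A", sources=1, mix_outputs=1),
    "304": EndNode("User B", sources=1, mix_outputs=1),
    "306": EndNode("User Group C", sources=3, mix_outputs=3),
    "308": EndNode("Front of House", sources=1, mix_outputs=2),
}


def system_capacity(nodes: Iterable[EndNode]) -> Tuple[int, int]:
    """Return (total sources, total mix outputs) for the nodes present."""
    present = list(nodes)
    return (sum(n.sources for n in present),
            sum(n.mix_outputs for n in present))


full_system = system_capacity(FIG3_NODES.values())
rehearsal = system_capacity([FIG3_NODES["302"], FIG3_NODES["304"]])
print("full system:", full_system)
print("A + B rehearsal:", rehearsal)
```

Removing the entries for end nodes 306 and 308 leaves the two-input, two-mix rehearsal system mentioned in the text, without changing anything on the remaining nodes.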
[0040] The distributed nature of a DSSN-architecture mixing
system provides clear benefits with regard to fault tolerance.
Specifically, if a given end node has a failure, the
remaining end nodes continue to operate without loss of any
functionality except the audio sources feeding the inputs of
the failed end node. The only mixes that are lost are those
produced by the failed end node. With a centralized mixer
architecture, it is possible to lose every mix output, or even
every audio path in the system if the central mixer fails.
[0037] As illustrated in FIG. 3, each of the four end nodes
has its own local mixer, and each of these local mixers is fed
by all six conditioned/enhanced source signals A' through F'.
Thus each end node, using its local mixer, is capable of
producing independent mixes for local delivery to its user or
users for listening. In addition, the self-monitoring path(s) for
each end node can be optimized for lowest possible latency
since the signal chain from audio source to input process, to
local mixer, to output process, to monitor output signal, is
contained within the local end node and does not traverse the
network.
[0038] End node 308 serves a front-of-house mix operator as
well as the audience to which he delivers the main “house
mix” via loudspeakers 338 and 340. In the illustrated
example, the node includes large mixer control panel 342 and
loudspeakers 338, 340. Talkback microphone 344 delivers
audio source signal F into the system, allowing the front-of-house
mix operator to relay verbal information into the headphones
of Users A, B, and C via F' while hearing his own
voice with optimally low latency in loudspeakers if he so
[0041] FIG. 3 also illustrates the distributed nature of control
in a DSSN mixer system. Front-of-house mix operator
controls the system using mixer control panel 342, which
sends and receives control commands to and from end node
308 either via direct connection 352, or via network link 350.
A separate operator may control the system from remote
locations by operating mobile controller 346 connected to
network 310 via wireless link 348. User A controls the system
using monitor control panel 336 of local end node 302. User
Group C has no local control panel and instead these performers
rely on other users or operators to control the parameters
of their input process, local mixer, and output process by
sending and receiving control commands from one or more
remote nodes on the network. Each parameter or function
within the overall system may be individually addressable,
allowing for arbitrary mappings between the point of control
and the function to be controlled. For example, User A might
choose to control the monitor mix for a first singer in User
Group C, while a second singer in User Group C chooses to
have his mix controlled by the front-of-house mix operator,
and a third singer in User Group C chooses to have his
monitor mix controlled by a roaming engineer operating
mobile controller 346. The combination of the four DSSN
mixer end nodes, the various controllers connected directly or
indirectly to the network, and the “all-to-all” connectivity
among all devices, as illustrated in FIG. 3, serves as an overall
audio mixing system that operates as a whole while being
distributed among multiple physical units located optimally
near the various and respective users of the system.
[0042] A network topology, as illustrated in FIG. 3, is contrasted
with point-to-point interconnection topology, in
which devices use multiple, dedicated links to communicate
with each other, each link supporting communication, either
unidirectional or bidirectional, between exactly two devices.
If the system illustrated in FIG. 2 used point-to-point links
instead of a network, it would require each end node to have
two separate transmit links and two separate receive links, to
communicate with all the devices in the system. If this system
were scaled up to ten end nodes, each would require nine
separate links in each direction, for a total of 90 links in the
system. Thus, the network connection topology improves
tremendously upon point-to-point topologies, making system
the physical distance between the two nodes (or more appropriately,
between the two users located near these respective
nodes). The receiving node will use the presentation time
parameter (or a similar mechanism used to encode time
delay) to delay the audio before injecting it into the local
mixer of the receiving node. Alternatively the sending node
may add extra delay to a signal destined for a particular other
end node in the system as an operation within its input channel
strip processing (described below) before transmitting the
audio signal onto the network.
[0046] Referring to FIG. 4, we now describe an embodiment
of an end node 400 in a mixing system based on DSSN
architecture. Audio input module 402 includes one or more
audio input ports for receiving source audio signals, for
example from microphones and instruments. Not all the
sources may be active at any given time. For example, an
external source selector switch may enable switching from
microphone input to a line-level input fed by a different
source. End nodes may have 1, 2, 4, 8, 16, or other number of
audio input ports. The audio ports may be of different types,
such as mic, line, digital, with various corresponding connector
formats for all of these. The need for different types of port
may be obviated by source selector switching upstream of the
input processing channels.
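The acoustic-path delay described in paragraph [0045] can be illustrated with a short calculation. The Python sketch below converts the distance between two performers into a target delay and subtracts the latency the network has already imposed. The speed-of-sound value, the 48 kHz sample rate, the function names, and the idea of subtracting transport latency are all assumptions made for illustration; the text says only that the delay is commensurate with physical distance.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 degrees C


def acoustic_delay_seconds(distance_m: float) -> float:
    """Time sound would take to travel distance_m through air."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return distance_m / SPEED_OF_SOUND_M_PER_S


def extra_delay_seconds(distance_m: float, transport_latency_s: float) -> float:
    """Delay still to be added after the network has taken its share.

    Clamped at zero: if transport already exceeds the acoustic delay,
    nothing more is added.
    """
    return max(0.0, acoustic_delay_seconds(distance_m) - transport_latency_s)


def delay_in_samples(delay_s: float, sample_rate_hz: int = 48_000) -> int:
    """Round a delay to a whole number of samples (assumed 48 kHz)."""
    return round(delay_s * sample_rate_hz)


if __name__ == "__main__":
    distance = 6.86   # metres between two performers (example value)
    transport = 0.002  # seconds already spent in the network (example value)
    extra = extra_delay_seconds(distance, transport)
    print(f"extra delay: {extra * 1000:.1f} ms = {delay_in_samples(extra)} samples")
```

Either the receiving node or the sending node could apply the resulting delay, matching the two alternatives given in the text.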
[0047] Each of the received audio signals is fed to a designated
input channel of one or more input channels 404, 406 of
end node 400. Each of the end nodes on the network that
services at least one input includes one or more input channels,
with the total number required being at least equal to the
setup and connection much easier. DSSN mixer architecture
depends on a network connection topology to achieve scalability
without complicating the interconnect problem.
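The link counts quoted in paragraph [0042] follow from simple arithmetic, sketched below in Python. With all-to-all point-to-point wiring, each of n nodes needs a transmit link to each of the other n - 1 nodes, giving n(n - 1) unidirectional links. The function names are hypothetical, and the assumption that a networked system needs just one connection per end node is a simplification for comparison.

```python
def point_to_point_links(num_nodes: int) -> int:
    """Unidirectional links needed for all-to-all point-to-point wiring."""
    if num_nodes < 1:
        raise ValueError("need at least one node")
    return num_nodes * (num_nodes - 1)


def network_connections(num_nodes: int) -> int:
    """Connections needed when every node plugs into a shared network.

    Assumes one network connection per end node.
    """
    if num_nodes < 1:
        raise ValueError("need at least one node")
    return num_nodes


for n in (3, 10):
    print(f"{n} nodes: {point_to_point_links(n)} point-to-point links, "
          f"{network_connections(n)} network connections")
```

For the three nodes of FIG. 2 this gives two transmit and two receive links per node, and for ten nodes it gives the 90 links stated in the text.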
number of active audio source inputs. In the example illustrated
in FIG. 4, N audio source inputs are routed to a different
one of the available input channels. In each input channel
(e.g., channel 404), the audio input is processed in the analog
domain (via analog front end 408), which may in some
instances include a preamplifier and in other cases only a
simple buffer stage. The input is then converted into digital
form by an analog-to-digital converter (410), and then fed
into one or more channel strips 412, 414. In some cases the
input signal may already be in digital format prior to entering
the end node, obviating the need for analog front end 408 or
analog-to-digital converter 410. A channel strip includes a
chain of processing blocks that are applied to a given input
[0043] In practice, computer networks utilize switches and
routers to facilitate communications between end nodes;
however, these devices do not participate in end-node conversations;
they merely serve to provide multiple access points to
the common network by providing a sufficient quantity of
network ports for end nodes to plug into. Network switches
and routers may also filter and direct network traffic between
ports to improve network efficiency, once they learn the
addresses of the devices connected to each port.
[0044] In some applications, it may not be desirable to
require a separate network switch or router unit in a DSSN
mixing system. Reasons for this may include saving system
cost, or simply the lack of availability of such a unit to some
users. To accommodate this case, some embodiments of
DSSN-architecture devices may include a built-in network
switch having at least two ports available for users to connect
to other network nodes in the system. With this feature, a
DSSN end node can facilitate a “daisy chain” connection
allowing it to be inserted between any two devices on the
network and maintain communication paths between any
combination of itself and the other two devices. It is possible
to connect a large number of end nodes this way without
requiring a separate network switch or router device, since
each end node can pass messages on to the next one in the
chain in either direction such that all devices in the chain can
communicate on the same network.
signal in a substantially sequential order. This processing
generally serves two distinct purposes: to condition or “clean
up” the source signal to make it suitable for downstream
processing and mixing with other signals; and to enhance, or
deliberately modify, the sonic character of the audio source,
to make it more pleasing in the context of the overall mix or
output signal delivered to an audience or user. This distinction
can sometimes be subtle, and in many cases there is overlap
between conditioning and enhancement, especially because
the intent of both is to improve the sound quality of signals.
However, we make this distinction to highlight the importance
of the location of certain audio processing functions
within the overall system architecture. As will be apparent in
the descriptions and diagrams that follow, this placement
choice has a very large impact on the scalability and functional
power of an overall mixing system built from distributed
components.
[0045] DSSN architecture permits delays to be specifically
programmed in order to mimic acoustic path latency of group
performance. This may help members of a performing group
perceive each other in a manner that more closely simulates
an acoustic environment. For example, a particular stream
that carries audio data from one end node to another can be
programmed with a “presentation time” commensurate with
[0048] For completeness we now describe some examples
of signal conditioners and signal enhancers. Signal conditioners
include, but are not limited to: a high pass filter (also
known as a low-cut filter or rumble filter); dynamics processing,
which may include an expander, a gate, a compressor, a
limiter, or a multi-band dynamics processor; an equalizer,
such as a parametric type with multiple bands having gain,
frequency, and bandwidth parameters, and shelving equalization;
delay used for time alignment; a de-esser (to remove
sibilance from a signal); and an adaptive feedback eliminator.
Examples of signal enhancers include, but are not limited to,
non-linear processes that change a signal’s harmonic structure,
such as tube simulation, magnetic tape simulation,
speaker cone simulation, and various other algorithms which
are often described by subjective terms such as “warmth,”
“brilliance,” “luster” and the like; processes that utilize splitting,
phase shifting and re-combining, such as chorus or
flanger effects; pitch correction or modification (for example,
the popular “auto-tune” algorithm), and instrument replacement
(sometimes known as re-voicing). It is also common
practice to use equalizers and dynamics processors (as
described above) for signal enhancement purposes.
[0049] Other functions that may be included in an Input
Channel Strip include metering, routing, and panning. Metering
allows a user to visually monitor the amplitude of an audio
signal. The inclusion of metering in an Input Channel strip
helps a mix operator to monitor all his audio sources and
make sure he is receiving the proper level that he expects,
upstream of the mixing function. In some cases a channel
strip’s functionality will allow a user to position a meter at
various points in the strip signal chain, or it may include
multiple meters acting simultaneously along the signal chain.
Routing functions enable a user to change the order of signal
processing blocks, select tap points in the signal chain
for feeding certain mix busses or other outputs, and to assign
or de-assign channel outputs to mix busses or other non-mixed
destinations (sometimes called “direct feeds”) in the
system. Panning is the positioning of a source, typically
within a stereo or surround-sound mix, to a desired location.
For example, in a stereo mix a source may be panned to the
left, to the center, to the right, or anywhere in between. In a
stereo surround mix, signals may be panned left versus right,
front versus back, and also set to a desired level of intensity in
the subwoofer channel.
[0050] In embodiments having a single channel strip per
input channel, only a single conditioned and/or enhanced
input signal is produced. This is used for all mixes, including
the local monitor mix and front-of-house mixes. The embodiment
illustrated in FIG. 4 includes two channel strips 412, 414
for each input channel. The digitized audio signals are fed to
each of the strips in parallel. The first strip is primarily used to
support a front-of-house mixing workflow, while the second
strip is used to support a separate monitor mixing workflow.
The front-of-house version is fed to network access module
416, and is made available on the network (e.g., FIG. 3, 308),
that connects the various components that comprise the mixing
system. In some embodiments, the output of the monitor
channel strip is also made available on the network (not
shown in FIG. 4).
[0051] Various embodiments also include one or more
additional channel strips per channel. For example, an aux/bonus
channel strip provides a third variant of the audio
inputs for various uses, such as a click track generated from a
kick drum source which can be useful for musicians to monitor
song tempo clearly while performing.
[0052] Some embodiments may include local analog mixer
420 to support a zero-latency monitoring path from audio
source input to analog monitor output. This path subverts A/D
converter 410 and D/A converter 426, as well as all the processing
stages in the digital domain, to provide a monitor mix
that includes non-delayed audio source signals delivered by
analog front-end 408 mixed by analog mixer 420 with submixes
containing all other signals of interest, produced by
digital mixer 418, processed by output channels 420, 422, and
converted back to analog by D/A converter 426 and delivered
to the analog mixer along path 422. In actual systems, it is
expected that the latency in a “zero latency” analog only
pathway is not more than about 50 microseconds, with the
latency defined as the time between receipt of an audio signal
at audio input port 402 and output of the analog-mixed signal
from audio output port 434. This zero-latency analog path can
be supplemented with reverberation to enhance the audio
source signals feeding into the analog mixer, and in many
cases performers desire reverberation on their own voice or
instrument in their monitor mix to achieve a more ambient
and natural sound. Some amount of signal conditioning and/or
enhancement, such as EQ or Dynamics, may be implemented
by analog front end 408 to better prepare the signal for
injecting into analog mixer 420. Reverberation may be
applied along the digital signal path, by monitor channel strip
414, producing a reverb-enhanced version of the source signal
which is fed into digital mixer 418 where it is available to
be submixed with other sources and then combined in the
analog domain back into the final monitor mix. Because
reverberation is a process that fundamentally relies upon
delaying the source signal, the delay produced by the digital
path does not hinder the zero-latency monitoring effect. In
other words, there is no such thing as “zero-latency reverb.”
[0053] Network module 416 connects end node 400 to a
network that connects it to other end nodes as well as to other
devices that may be included within the DSSN architecture-based
audio mixing system. In the described embodiment, the
network is a packet-switched network, such as Ethernet, connected
to network module 416 via one or more standard
Ethernet jacks or equivalent, and/or via a wireless connection.
[0054] The one or more versions of the N processed channels
are fed to digital mixer 418. In the described embodiment
digital mixer 418 is a digital matrix mixer capable of
weighted mix summing. In addition to receiving the local
source channels, the mixer receives processed channels over
the network. These channels may be made available over the
network from other end nodes, or from other devices on the
network, as described below. In addition, local mixes from
other end nodes may be available on the network and input
to digital mixer 418, which we also refer to as the “local mixer”
to indicate that it is collocated with a performer providing
audio input to node 400.
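A minimal Python sketch of the weighted mix summing attributed to the matrix mixer is given below: each output mix is a weighted sum of the processed input channels, whether they are local or received from the network. The function names are hypothetical. The helper that turns a level and a stereo pan position into left and right weights uses a constant-power pan law, which is an assumption; the text does not say which pan law, if any, is used.

```python
import math
from typing import List, Sequence, Tuple


def constant_power_pan(position: float) -> Tuple[float, float]:
    """Map pan position -1.0 (left) .. +1.0 (right) to (left, right) gains.

    Assumed pan law: left^2 + right^2 stays equal to 1 at every position.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("pan position must lie in [-1.0, 1.0]")
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def matrix_mix(gains: Sequence[Sequence[float]],
               inputs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compute each mix as a weighted sum of all input channels.

    gains[m][n] is the weight of input channel n in mix m;
    inputs[n] is the list of samples of input channel n.
    """
    if any(len(row) != len(inputs) for row in gains):
        raise ValueError("each gain row needs one weight per input channel")
    if len({len(channel) for channel in inputs}) > 1:
        raise ValueError("all input channels must have the same length")
    num_samples = len(inputs[0]) if inputs else 0
    return [[sum(w * channel[t] for w, channel in zip(row, inputs))
             for t in range(num_samples)]
            for row in gains]


def stereo_gain_rows(levels: Sequence[float],
                     pans: Sequence[float]) -> List[List[float]]:
    """Build the left and right gain rows of one stereo mix."""
    left_row, right_row = [], []
    for level, pan in zip(levels, pans):
        left, right = constant_power_pan(pan)
        left_row.append(level * left)
        right_row.append(level * right)
    return [left_row, right_row]


if __name__ == "__main__":
    channels = [[1.0, 2.0], [4.0, 8.0]]        # two processed channels
    gains = stereo_gain_rows([1.0, 0.5], [-1.0, 1.0])
    print(matrix_mix(gains, channels))
```

A stereo mix occupies two rows of the gain matrix, and a node that produces several mixes simply adds more rows.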
[0055] The system allows for a local mix generated on one
end node to be monitored remotely, i.e., by a user located at a
different end node. This would be common practice in applications
where a dedicated “monitor mix engineer” controls
the mixes outputted to individual performers, and needs to
hear each of those mixes while he is adjusting them. In this
application the latency imposed by transporting a local mix
from one end node across the network to the monitor engineer’s
end node is inconsequential because the monitor engineer
is only monitoring others and not monitoring himself. It
is generally true that the monitor engineer could replicate the
same remote mix that he wishes to monitor, using the local
mixer of his local end node configured to replicate the mix
parameters as they are set in the remote node; however, the
engineer would likely prefer to directly monitor the exact
signal that is being generated within the remote end node, so
that he can be sure he is hearing exactly what the remote user
is hearing. This is sometimes referred to as “confidence monitoring.”
[0056] The mixer output(s) are fed to one or more output
channels 424, 426. The number of output channels provided
generally corresponds to the number of different mixes to be
output by the end node. For example, if multiple performers
are sharing a given end node, and each desires a custom mix,
the number of output channels needs to be at least as large as
the number of performers sharing the end node. Different
output mixes may also be required to drive different audio
output devices, such as headphones or loudspeakers. Furthermore,
when an output is configured as stereo, the corresponding
mix comprises two discrete channels, left and right. It is
common for such a two-channel stereo mix to be referred to in
singular sense (i.e. “a mix”), because it feeds a singular destination
such as a pair of headphones worn by a single user.
The same principles apply to surround mixes, which comprise
more than two channels, for example a 5.1-channel
surround mix has six discrete channels and a 7.1-channel
surround mix comprises 8 discrete channels. Similarly, the
processing channels that are used to apply enhancements to
stereo or surround mixes may be referred to in the singular
sense; for example a “stereo output channel” actually comprises
two discrete paths, left and right.
[0057] Each of the output channels includes its own output
channel strip 428, D/A converter 430, and “analog back end”
432. The primary purpose of output channel processing is to
adapt the outgoing local mix to the environment in which the
mix is to be heard, the output device (e.g., headphones, loudspeaker,
or line-out for onward transmission), as well as for
the specific requirements of individual performers or other
users of the system such as mixing engineers. In general, the
output channel strip comprises various signal processing
functions arranged in sequential order, in similar fashion to
the input channel strip described previously, but configured to
suit output channel processing purposes described above
rather than to condition or enhance audio source signals.
Accordingly, an output channel strip might include some or
all of the conditioners and enhancer functions described previously
in the context of the input channel strip.
[0058] From the output channel strip 428, the processed
signal is fed to D/A converter 430, analog back end 432, and
on to audio output module 434. The audio output module
includes connectors suitable for various audio output devices,
such as loudspeakers, headphones, and also line out signal. In
addition the signal from the output channel strip may be
delivered to the network via network connection module 416,
thus making the local mixes available to other devices and
users on the network.
[0059] In DSSN architecture, source audio signals are both
received and processed (i.e., conditioned and/or enhanced,
and thereby prepared for mixing) within the same end node,
before being transmitted onto the network. Consequently, an
end node receiving these pre-processed audio source signals
from the network is able to inject them directly into its local
mixer, without needing to further process these signals
between reception and mixing. This aspect contributes to the
self-scaling property of DSSN architecture because the processing-intensive
“heavy lifting” of conditioning and/or
enhancing operations does not need to be performed at the
end node on the receiving end of the network. Instead, the
burden of conditioning and enhancing is kept at the transmitting
end node, where the source signals first enter the system.
Natural self-scaling is achieved because end nodes are added
as needed to accommodate the number of users (this number
generally scales with the number of audio sources in the
system), and each node brings its own supply of input channel
processing resources to add to the overall system. The self-scaling
feature of DSSN architecture requires an adequately
sized mix matrix at each end node, as well as the ability to
receive a large number of individual source signals from the
network. This aspect needs to scale up in a DSSN-architecture
system with increasing size of the mixing application.
Specifically, both the mix matrix and the network connection
need to support enough inputs to accommodate the maximum
number of sources that any given mix will need to include.
With gigabit Ethernet, it is straightforward to carry 200 or
more linear PCM-encoded audio signals on one link, and with
modern processing devices such as FPGA-based mix matrix
computational units, DSPs, or general purpose CPUs
designed to compute matrix sums efficiently, hundreds of
signals can be mixed readily without undue expense. In contrast,
the signal processing operations involved in conditioning
and enhancing signals typically involve operations that
are much more complex and burdensome than mixing or
network transmission/reception. As one example, a dynamics
processor such as a compressor, limiter, expander, gate, or
multi-band compressor, employs amplitude detection,
dynamic gain lookup and computation, and the application of
time constants within these operations. Such operations do
not map naturally onto the computational units found in
today’s FPGAs, DSPs, and CPUs. As a second example,
nonlinear processes that simulate phenomena such as tube
amplification and saturation, or magnetic tape saturation,
typically involve complex operations such as lookup tables,
hysteresis loops, polynomial or spline computation, or adaptive
equalization. As a third example, reverberation effects
involve large memory buffers, filters, randomized summing
operations, and more. Thus, a distributed mixing system that
applies conditioning or enhancement processing on the
receiving end of the network is greatly disadvantaged in its
ability to scale, because the receiving node would run out of
processing resources as sources are added to the system.
[0060] FIG. 5 illustrates the splitting up of output processing
in end node 500 into submix processing and actual output
processing. S different mixes 502 from mixer 504 are output
to S submix processing channels 506, each of the submix
outputs being sent to a corresponding one of the submix
channels. The submix channels may all be implemented on
one or more processing devices, which might include DSP or
FPGA or general-purpose CPU type devices. The submix
channel processing blocks may include various effects, such
as reverb, EQ, echo, and delay. After processing, the S outputs
508 of the submix channel processors are fed back as inputs to
the mixer, and are available for mixing into output channels
for processing in the output channels. Mixer 504 outputs a
total of M mixes, of which M-S (510) are directed to a corresponding
number of output channels.
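The submix feedback path of FIG. 5 can be sketched in Python as a two-pass mix: the S submixes are computed from the sources, passed through their submix processing, and returned to the mixer as additional inputs for the remaining M-S output mixes. All names below are hypothetical, and the halving function merely stands in for a real effect such as reverb or EQ. Submixes in this sketch draw only on source signals, not on other submixes, which is a simplifying assumption.

```python
from typing import Callable, Dict, List

Signal = List[float]
Weights = Dict[str, float]


def weighted_sum(weights: Weights, signals: Dict[str, Signal], length: int) -> Signal:
    """Sum the named signals, each scaled by its weight."""
    out = [0.0] * length
    for name, weight in weights.items():
        for i, sample in enumerate(signals[name]):
            out[i] += weight * sample
    return out


def mix_with_submixes(sources: Dict[str, Signal],
                      submix_weights: Dict[str, Weights],
                      submix_effects: Dict[str, Callable[[Signal], Signal]],
                      output_weights: Dict[str, Weights]) -> Dict[str, Signal]:
    """Compute S submixes, process them, and feed them back into M-S output mixes."""
    length = len(next(iter(sources.values())))
    mixer_inputs = dict(sources)
    for name, weights in submix_weights.items():
        raw = weighted_sum(weights, sources, length)
        # The processed submix returns to the mixer as just another input.
        mixer_inputs[name] = submix_effects[name](raw)
    return {name: weighted_sum(weights, mixer_inputs, length)
            for name, weights in output_weights.items()}


def halve(signal: Signal) -> Signal:
    """Stand-in for a submix effect; simply scales the signal by 0.5."""
    return [0.5 * s for s in signal]


sources = {"kick": [1.0, 0.0], "snare": [0.0, 1.0], "vocal": [2.0, 2.0]}
result = mix_with_submixes(
    sources,
    submix_weights={"drums": {"kick": 1.0, "snare": 1.0}},   # S = 1
    submix_effects={"drums": halve},
    output_weights={"main": {"drums": 1.0, "vocal": 1.0}},   # M - S = 1
)
print(result)
```

In this example the mixer produces M = 2 mixes in total: one submix ("drums") that is fed back, and one output mix ("main") that would go on to an output channel.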
[0061] Submixes serve to improve the local output mix on
a particular end node in a number of ways. One way is to
enable sound engineers to “divide and conquer” the mixing
task using a hierarchical grouping. Similar signal sources are
assigned to their own group mixes, which are then fed into the
main mix, thereby reducing the number of separate sources
that contribute to the main mix. As an example, ten drum
microphone inputs may be mixed into a stereo pair of “master
drum” signals feeding the main mix. Other examples include
horn sections or background singers. Another way in which
submixes may improve the local output mix is to allow effects
to be applied to a group of channels using a single instance of
an effects processor. The output of the effects processor is
then assigned to the main mix, and treated as just another
channel (often called a “reverb return” or “effects return”
depending on the usage). This is much more efficient than
having separate instances of the same effects processor running
on input channel strips upstream of the mixer. It also
provides a reasonable model of actual reverberation, as multiple
sound sources stimulate the air of a common acoustical
space, and the resulting echoes and reflections are summed
with the direct sound at the listener’s ears.
[0062] A further advantage provided by network-based,
distributed systems is the ability to control the signal processing,
mixing, and routing operations remotely from user interfaces
connected on the network, which means that these “signal
operations” no longer need to be carried out in the same
locations where the sound engineer, technicians, or performers
may be controlling the system.
[0063] DSSN architecture-based systems enable the user
interface for any device or function in the system to be hosted
locally, remotely or in multiple locations on the network
simultaneously. FIG. 6 is a high level block diagram of an end
node 600 showing control pathways within the node, and
omitting audio signal pathways. Node 600 includes CPU 602
for controlling the various components of the node, and for
hosting a local user interface for the node. The CPU
exchanges control commands and data with input/output 604,
which may include a control panel with a built-in display, or
a separate display with keyboard, mouse, or other devices for
receiving user input such as wireless interfaces, e.g., for Wi-Fi
devices such as tablets and smartphones. CPU 602 is also in
data communication with network access 606. Network
access 606 includes a multi-port switch capable of passing
both control and audio (and video) traffic between external
ports and the local end node. CPU 602 may host an interface
for controlling remote other end nodes over the network, or
receive control commands from UIs running on other nodes.
Host CPU 602 may also issue commands for configuring the
audio processing distributed system. The figure also illustrates
control pathways from CPU 602 to input channel processing
608, submix processing 610, and output channel processing
612.
[0064] Such remote user interfaces are capable of offering
full control of any given device. For example, the input processing,
mixing, and output processing may all be control
[0065] As indicated above, in addition to the individual
conditioned audio sources local to each of the nodes, various
local mixes are made available on the network. This enables
a user at a first end node to listen to and adjust a mix delivered
to the network by a second end node having audio source
input. The processing power to perform this adjustment may
be provided by the first end node, i.e., local to the user
making the adjustment, or may not involve anything more
than the user interface hosted by the first end node and instead
using processors on the second end node or on another device
on the network. Alternatively, processing may be performed
partially on the first end node and partially on the second end
node. For example, input processing may be performed on the
second end node local to the audio source and mixing and
output processing may be performed on the second end node
local to a remote user.
[0066] End nodes may exclude audio inputs and only provide
audio outputs. Conversely, end nodes may exclude audio
outputs and only provide audio inputs. Such “unidirectional”
end nodes may be included in an overall DSSN mixing system
without impairing the scalability of the system or the
benefits of DSSN architecture; however, it is recognized that
such end nodes do not include the low-latency self monitoring
feature, simply because this feature requires both inputs and
outputs on the local device. An example of an output-only end
node would be a network-connected loudspeaker. By having
an internal (local) mixer and output processing chain, this
device can create its own custom mix for direct outputting to
the device’s amplifier and transducer elements. In this way, an
array of loudspeakers configured for 7.1 channel surround
sound playback could comprise a set of 8 end nodes, one
inside each loudspeaker unit and producing the discrete output
channel corresponding to that unit’s location in the array.
By contrast, in a conventional system, the mixing operations
are performed in a centralized mixer, which then feeds 8
separate output signals to the loudspeakers, each of which is
configured to receive one signal and deliver that signal to its
acoustical output. An example of an input-only end node is a
network-connected microphone. Another example is a playback
device for delivering pre-recorded audio into the system.
In this case, the audio source signals are produced by the
audio storage medium rather than physical input connectors.
Such a device has no person performing who needs to monitor
her performance, and thus has no need for a local audio
output. However, such a device may still need its audio channels
processed before they are presented onto the network for
subsequent mixing and monitoring.
lable by one or more remote users. Different users may be
given speci?c permissions to access different functions
based systems have been described With respect to the pro
cessing of audio signals. In addition, such systems are able to
Within the system, or even different functions Within a given
In the foregoing discussion, the DSSN architecture
ci?c functions in the system or a given end node. For example,
UserA in FIG. 1 may be able to control his local mixer but not
support video functionality, particularly because the netWork
that interconnects the end nodes is ?lly capable of transport
ing both audio and video signals in real-time, and also
the parameters of his input processing channel. Alternatively,
because modem VLSI integrated circuits such as multimedia
referencing FIG. 5, the local user of the illustrated end node
system-on-chip devices designed to manage and process both
audio and video signals, have become inexpensive and
readily available. The result is that video functionality, Which
end node, While being disalloWed from accessing other spe
might be given permission to control his monitor channel
strip (514) but not his front-of-house input channel strip
(512), because all control of the front-of-house input channel
strip should remain the responsibility of the remotely located
has conventionally been handled by dedicated equipment
separate from the audio system, can be merged into the audio
front-of-house mix operator. The UI may be hosted by a
system. For example, each end node might have video inputs
device that is not co-located With any of the audio sources.
Such a device may be implemented on a client computer, or
and outputs (e.g., a camera and a display) in addition to its
on a portable device connected Wirelessly to the netWork,
such as a smartphone or tablet.
audio inputs and outputs. Users located at separate end nodes
can exchange visual information in addition to their primary
mode of interaction involving audio signals. Such a system
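
As an illustration of the per-user permissions described earlier for remote user interfaces (a local user permitted to control monitor channel strip 514 but not front-of-house input channel strip 512), the following is a minimal Python sketch. It is not taken from the application: the class `EndNodePermissions`, the function identifiers, the user names, and the `gain_db` parameter are all hypothetical, and a real end node would carry such commands over the network rather than through direct method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical identifiers; the numerals echo reference numerals 512 and 514.
FOH_STRIP = "foh_input_channel_strip_512"
MONITOR_STRIP = "monitor_channel_strip_514"


@dataclass
class EndNodePermissions:
    """Per-user grants for the controllable functions of one end node (assumed model)."""
    node_id: str
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, user: str, function: str) -> None:
        # Record that `user` may control `function` on this node.
        self.grants.setdefault(user, set()).add(function)

    def is_allowed(self, user: str, function: str) -> bool:
        # Users with no recorded grants are allowed nothing.
        return function in self.grants.get(user, set())

    def apply(self, user: str, function: str, parameter: str,
              value: float, state: dict) -> bool:
        """Apply a control command only if permitted; return True if it was applied."""
        if not self.is_allowed(user, function):
            return False
        state.setdefault(function, {})[parameter] = value
        return True


def demo() -> dict:
    perms = EndNodePermissions(node_id="end_node_A")
    # The local performer controls only the monitor strip; the remote
    # front-of-house operator controls only the front-of-house strip.
    perms.grant("local_performer", MONITOR_STRIP)
    perms.grant("foh_operator", FOH_STRIP)

    state: dict = {}
    results = {
        "performer_monitor": perms.apply(
            "local_performer", MONITOR_STRIP, "gain_db", -3.0, state),
        "performer_foh": perms.apply(
            "local_performer", FOH_STRIP, "gain_db", 6.0, state),
        "operator_foh": perms.apply(
            "foh_operator", FOH_STRIP, "gain_db", -1.5, state),
    }
    return {"results": results, "state": state}
```

In this sketch the performer's attempt to change the front-of-house strip is rejected and leaves the node state untouched, while the remote operator's command on the same strip is applied.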