Audio over Ethernet
In audio engineering (and now in broadcast engineering), audio over Ethernet (sometimes abbreviated AoE) is the use of an Ethernet-based network to distribute digital audio. It is designed to replace bulky snake cables and to use the existing wiring infrastructure in a facility, providing a reliable backbone for any audio application, such as linking multiple studios or stages.
While on the surface it resembles voice over IP (VoIP), it differs in several important ways. First, AoE is high-bandwidth: it is intended for high-fidelity, and therefore high-bitrate, professional audio rather than voice, and for isochronous, multi-channel use rather than independent streams. Second, it is designed for very high reliability, including very low latency (under one millisecond) and almost no packet loss. For this reason, audio over Ethernet is also uncompressed, which avoids both encoding delay and unwanted compression artifacts. Likewise, AoE by definition runs on a dedicated local-area network (LAN), or at minimum on a virtual LAN (VLAN), so that quality of service (QoS) can be guaranteed and the audio arrives uninterrupted and uncorrupted. AoE also uses neither TCP or UDP at layer 4 nor Internet Protocol at layer 3 (see OSI model); instead, each system uses its own protocol, which builds data packets and data frames that are transmitted directly onto the Ethernet (layer 2) for efficiency and low overhead. The word clock may be provided by broadcast packets.
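As a rough illustration of what "transmitted directly onto the Ethernet (layer 2)" means, the following sketch assembles an Ethernet II frame whose payload is raw 24-bit PCM samples, with no IP or UDP header in between. The frame layout is an assumption made for this example only: the 4-byte audio header, the function names, and the choice of EtherType 0x88B5 (an IEEE 802 local experimental value) are hypothetical and do not correspond to any of the protocols discussed here.

```python
import struct

# Hypothetical values for illustration; no real AoE protocol uses this layout.
ETHERTYPE_EXPERIMENTAL = 0x88B5  # IEEE 802 Local Experimental EtherType 1
BROADCAST_MAC = bytes.fromhex("ffffffffffff")


def pack_samples_24bit(samples):
    """Pack signed integer samples as 24-bit big-endian two's-complement words."""
    out = bytearray()
    for s in samples:
        if not -(1 << 23) <= s < (1 << 23):
            raise ValueError("sample out of 24-bit range")
        out += (s & 0xFFFFFF).to_bytes(3, "big")
    return bytes(out)


def build_frame(dst_mac, src_mac, sequence, samples):
    """Return an Ethernet II frame (without FCS) carrying raw PCM samples."""
    # Standard 14-byte Ethernet II header: destination, source, EtherType.
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_EXPERIMENTAL)
    # Hypothetical audio header: 16-bit sequence number, 16-bit sample count.
    audio_header = struct.pack("!HH", sequence & 0xFFFF, len(samples))
    payload = audio_header + pack_samples_24bit(samples)
    # Ethernet requires at least 46 payload bytes; pad short frames with zeros.
    payload = payload.ljust(46, b"\x00")
    return header + payload


frame = build_frame(BROADCAST_MAC, bytes.fromhex("020000000001"), 0, [0, 1, -1, 8388607])
print(len(frame), frame[12:14].hex())
```

Sending such a frame would need a raw socket and elevated privileges, which are platform-specific and left out here. The point is only that the audio data sits immediately behind the 14-byte Ethernet header.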
There are several different protocols for audio over Ethernet. Most are proprietary, and they are mutually incompatible:
* CobraNet by Peak Audio (now owned by Cirrus Logic) [1]
** RAVE by QSC Audio [2]
* EtherSound by Digigram [3]
** NetCIRA by Fostex [4]
* Livewire by Axia/Telos (mainly used in broadcasting) [5]
* MaGIC by Gibson (non-proprietary) [6]
* AES50 (non-proprietary)
** SuperMAC, an implementation of AES50 by SonyOxford [7]
Using category 5 cable and 100BASE-TX hubs and switches, each protocol can generally transmit up to 64 channels at a 48 kHz sampling rate with 24 bits per sample. Some can handle other rates, such as 44.1 kHz (CD-quality), 88.2 and 96 kHz (2× oversampling), and even 192 kHz (4×), as well as samples of up to 32 bits, with a corresponding reduction in channel capacity. Some protocols accomplish this through channel bonding, while others use individually scalable channels. Each protocol is also designed for different network topologies, and some use their own media access controllers (MAC) rather than the one native to Ethernet, which can create compatibility issues when they encounter traffic from other network interface devices. AoE is not generally intended for wireless networks, so 802.11 devices may or may not work with a given AoE protocol.
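The trade-off between sampling rate, sample size, and channel count follows from simple arithmetic. The sketch below assumes, for illustration, that the audio payload budget stays fixed at whatever 64 channels of 48 kHz, 24-bit audio consume, and that framing overhead can be ignored. Real protocols allocate bandwidth in their own ways, so the figures are indicative only, and the function names are made up for this example.

```python
LINK_BPS = 100_000_000  # nominal 100BASE-TX line rate, bits per second


def payload_bps(channels, sample_rate_hz, bits_per_sample):
    """Raw audio bit rate, ignoring all framing overhead."""
    return channels * sample_rate_hz * bits_per_sample


def max_channels(budget_bps, sample_rate_hz, bits_per_sample):
    """Whole channels that fit in a fixed audio payload budget."""
    return budget_bps // (sample_rate_hz * bits_per_sample)


# Assumed budget: the payload of 64 channels at 48 kHz / 24 bits.
budget = payload_bps(64, 48_000, 24)
print(f"baseline payload: {budget / 1e6:.3f} Mbit/s "
      f"({budget / LINK_BPS:.1%} of the nominal line rate)")

for rate, bits in [(48_000, 24), (96_000, 24), (192_000, 24), (48_000, 32)]:
    print(f"{rate} Hz, {bits}-bit: {max_channels(budget, rate, bits)} channels")
```

Under these assumptions the baseline payload is 73.728 Mbit/s. Doubling the sampling rate halves the channel count to 32, quadrupling it leaves 16, and moving to 32-bit samples at 48 kHz leaves 48 channels.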
The Audio Engineering Society's MADI (AES10), although similar in function, uses 75-ohm coaxial cable with BNC connectors instead. It is most similar in design to AES3, which can carry only two channels (stereo).
In broadcasting, and to some extent in studio and even live production, many manufacturers equip their own audio engines to be tied together with Ethernet. This may also be done with gigabit Ethernet and optical fibre rather than wire. This allows each studio to have its own engine, or auxiliary studios to share an engine. Connecting the engines together lets different sources be shared among them. Logitek Audio is one such company using this approach.
An audio over IP (AoIP) setup differs in that it works at a higher layer, encapsulated within Internet Protocol. These systems are usable on the Internet, but may not be as instantaneous, and are only as reliable as the network route, such as the path from a remote broadcast back to the main studio, or the studio/transmitter link (STL), the most critical part of the airchain. This is similar to VoIP; however, AoIP is comparable to AoE for a small number of channels, which are usually also data-compressed. Reliability for permanent STL uses comes from the use of a virtual circuit, usually on a leased line such as T1/E1, or at minimum ISDN or DSL.
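The cost of working at a higher layer can be estimated from the header sizes involved. The sketch below compares the share of each frame that is actual audio when samples ride directly in an Ethernet frame versus inside IPv4 and UDP. UDP transport, the 144-byte audio payload (for instance 8 channels × 6 samples × 3 bytes), and the function name are assumptions for the example; the preamble and inter-frame gap are ignored, and any protocol-specific audio header is left out on both sides.

```python
ETH_HEADER = 14   # destination MAC, source MAC, EtherType
ETH_FCS = 4       # frame check sequence
IPV4_HEADER = 20  # minimum IPv4 header, no options
UDP_HEADER = 8


def efficiency(audio_bytes, extra_headers=0):
    """Fraction of the Ethernet frame occupied by audio samples."""
    total = ETH_HEADER + extra_headers + audio_bytes + ETH_FCS
    return audio_bytes / total


AUDIO_BYTES = 144  # assumed payload size per packet
layer2 = efficiency(AUDIO_BYTES)
over_ip = efficiency(AUDIO_BYTES, IPV4_HEADER + UDP_HEADER)
print(f"layer 2 only: {layer2:.1%}, over IPv4/UDP: {over_ip:.1%}")
```

The smaller the packets, which low latency tends to require, the more the extra 28 bytes of IPv4 and UDP headers weigh on each frame.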