It’s a safe bet that many Copper readers are interested in getting involved with higher-quality audio streaming services and digital music servers if they’re not using them already. In this series, Andy Schaub, a contributor to Positive Feedback and other publications, explains the technology behind streaming audio and what’s involved in getting set up with high-quality streaming and digital music playback.
Streaming Audio: A Glossary
In order to gain a fundamental understanding of the principles of streaming audio, we’ll begin with a glossary of terms. We’ll build upon things from there.
In the context of home audio systems, streaming means to receive and play a digital data stream containing “bits” of music from somewhere on the internet (from a source like Spotify, TIDAL, Pandora, etc.) or from a local cache of digital data (music) files you have stored on a fixed drive or drives. In other words, there’s no physical media involved; you access and control the source, routing and playback of the music with software.
Server (Distributed Server System)
A server is usually a fairly powerful computer or dedicated audio component that is connected to a LAN (Local Area Network) or WAN (Wide Area Network) to perform tasks that are too intense for a client computer or other device that controls music playback. The control device can be a mobile phone, tablet, laptop or other device.
The server (the computer where the music resides or is streamed from) may perform tasks for more than one client and can also deal with information exchange between clients and the routing of music in the form of data files from one point to another. Many servers are called “headless,” because they don’t normally need a keyboard, screen or mouse and can be accessed over the network. However, they are still just computers and sooner or later you’ll need to reboot them, so you’ll need to have a monitor, keyboard, and mouse on hand even for such “headless” systems.
High-resolution audio is such a generic term that it’s hard to define outside of a context. For our purposes, we’ll assume that high-resolution audio means three things (all in the digital domain):
- Little to no loss of information even after the process of compressing and decompressing audio files.
- By convention, better than the CD-quality 16-bit audio bit depth and 44.1 kHz sample rate. Bit depth can be up to 24-bit with typical sampling rates of 96 kHz and 192 kHz, although other sample rates exist.
- There’s usually some attempt to ensure that the reconstructed musical waveform is as true as possible to the original, through various methods of maintaining the proper timing of the musical data with clock signals, and the use of optimal (and often minimal) filtering (AKA interpolation), using math to rebuild what might be missing in the original signal.
All this being said, if the music sounds convincingly “real,” then it’s high-fidelity audio, regardless IMHO of the actual measurable resolution or specs.
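To put the bit depths and sample rates above in perspective, here is a minimal sketch that works out the raw data rate of uncompressed stereo PCM audio. The function name and the choice of two channels are assumptions made for illustration; real streaming services usually apply lossless compression, so the rates on the wire are lower than these figures.

```python
def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


# Formats mentioned in the glossary: (sample rate in Hz, bit depth)
FORMATS = {
    "CD quality (16-bit / 44.1 kHz)": (44_100, 16),
    "Hi-res (24-bit / 96 kHz)": (96_000, 24),
    "Hi-res (24-bit / 192 kHz)": (192_000, 24),
}

for name, (rate, depth) in FORMATS.items():
    bps = pcm_bits_per_second(rate, depth)
    print(f"{name}: {bps / 1000:.1f} kbit/s")
```

Stereo CD-quality audio works out to about 1,411 kbit/s, while 24-bit/96 kHz and 24-bit/192 kHz come to about 4,608 and 9,216 kbit/s respectively, which is one reason hi-res streaming asks more of your network and storage.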
A remote app is a “lightweight” app (meaning that it doesn’t use a lot of system resources) that runs on client devices to control a music streaming system. Most apps don’t actually stream the music. They just tell the device (the computer or audio component that functions as the music server) where to get the files and where to send them to play. Things get a little complex after that, which we’ll go over later.
NAS – Network Attached Storage
NAS, or Network Attached Storage, refers in general to a data storage device that connects to your network rather than directly to a single computer, so that any device on the network can store and retrieve files on it quickly, with little to no error or delay. In a music system, its purpose is to “serve” music to the system by transporting digital files over a LAN or WAN, as opposed to sending analog electrical signals through a wire.
It’s often an array of fixed drives in a RAID (Redundant Array of Independent Disks) configuration, housed in a separate box with a simple operating system and no user interface.
Companies like Synology and QNAP make NAS units that serve both general-purpose and home-entertainment needs. Melco is one brand that makes NAS products targeted specifically at audiophiles. Some models from Melco and other brands have a built-in CD drive and audio ripping software to convert CDs to audio files – very handy if you want to transfer your CD library for easy-access listening.
Traditionally, a DAC, or digital-to-analog converter, receives digital audio from a disc transport (such as a CD or SACD transport, or the output from a CD, SACD, Blu-ray or DVD player), using an S/PDIF, AES, USB, optical digital (Toslink) or other “legacy” connection. A network bridge is an interface component that takes the data – packets of information – coming from a server before the data goes to the DAC, sorts through the packets, splices them together, and buffers the information to create a stream of bits that spaces almost every “bit” exactly equally far apart in time to reduce or eliminate distortions. The network bridge then converts that stream to a USB, Ethernet or other output and sends it to the DAC. Typically, you can control a network bridge using a remote app.
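The “sort, splice and buffer” step can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `ReorderBuffer` and the idea that each packet carries a simple sequence number are hypothetical, and a real network bridge does this in dedicated hardware and firmware, with a precision clock pacing the output.

```python
import heapq


class ReorderBuffer:
    """Toy buffer that puts out-of-order packets back into sequence.

    Assumes every packet carries an integer sequence number starting at 0.
    """

    def __init__(self):
        self._heap = []      # packets waiting, smallest sequence number first
        self._next_seq = 0   # the packet we need next to continue the stream

    def push(self, seq, payload):
        """Store a packet as it arrives from the network, in any order."""
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self):
        """Return the payloads that can now be played back, in order."""
        ready = []
        while self._heap and self._heap[0][0] == self._next_seq:
            _, payload = heapq.heappop(self._heap)
            ready.append(payload)
            self._next_seq += 1
        return ready


buf = ReorderBuffer()
buf.push(1, b"second")       # arrives early, so it has to wait
print(buf.pop_ready())       # nothing can be released yet
buf.push(0, b"first")
print(buf.pop_ready())       # both packets are released, in order
```

The buffer only releases data once the next packet in sequence is present, which is why a bridge can hand the DAC a continuous stream even when the network delivers packets unevenly.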
Simply put, firmware is just software that’s loaded directly onto a chip into ROM (read-only memory). This allows the chip to serve a specific function much faster and more efficiently than if the software were running somewhere else. Some firmware can be updated by the user, some can’t. It depends on the architecture of the chip and whether the product can be connected to an external source (such as a USB stick, to name one example) to do the update.
UPnP – Universal Plug and Play
UPnP, Universal Plug and Play, is simply a standard protocol and interface that allows streamers and data storage devices to communicate with each other. It’s a little like USB but at the software level and enables hardware devices to “speak the same language.” It’s sort of like a Rosetta Stone for streaming, applications, and devices. Here’s the Wikipedia definition:
- Universal Plug and Play (UPnP) is a set of networking protocols that permits networked devices, such as personal computers, printers, Internet gateways, Wi-Fi access points and mobile devices to seamlessly discover each other's presence on the network and establish functional network services for data sharing, communications, and entertainment. UPnP is intended primarily for residential networks without enterprise-class devices.
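For the curious, the “discover each other’s presence” part of UPnP is handled by a small protocol called SSDP: a device sends a short search message to a well-known multicast address, and compatible devices answer with the location of their description file. The sketch below shows roughly what that looks like. The helper names are made up for this example, and whether anything answers depends entirely on your own network and devices.

```python
import socket

SSDP_ADDR = "239.255.255.250"   # standard SSDP multicast address
SSDP_PORT = 1900                # standard SSDP port
MEDIA_SERVER = "urn:schemas-upnp-org:device:MediaServer:1"


def build_msearch(search_target=MEDIA_SERVER, mx=2):
    """Build the SSDP M-SEARCH request that asks devices to announce themselves."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",                 # devices may wait up to this many seconds
        f"ST: {search_target}",      # the kind of device we are looking for
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_headers(response):
    """Turn a raw SSDP response into a dict with lower-case header names."""
    headers = {}
    for line in response.decode("utf-8", errors="replace").split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def discover(timeout=3.0):
    """Send one search and collect (ip address, description URL) pairs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.settimeout(timeout)
    found = []
    try:
        sock.sendto(build_msearch(), (SSDP_ADDR, SSDP_PORT))
        while True:
            data, addr = sock.recvfrom(65507)
            found.append((addr[0], parse_headers(data).get("location")))
    except socket.timeout:
        pass  # no more replies within the timeout
    finally:
        sock.close()
    return found
```

Calling `discover()` on a home network with a UPnP media server running (a NAS with its media server enabled, for instance) should return that device’s address and the URL of its description file; on a network with no such devices it simply returns an empty list.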
Let’s start digging more deeply.
What Kinds of Software Do Audiophiles Use?
There are many streaming services, systems and devices available, but only some are capable of delivering high-resolution/better than CD-quality audio. If you’re new to hi-res streaming, how do you sort it all out?
First of all, we need to step away from thinking of recorded media, whether digital or analog, as “software.” It’s actually data that can be stored, transmitted and reproduced. Software is something that tells a computer how to do something. Audiophile-oriented software tends to fall into one of three categories:
- Remote control apps
- Distributed server systems
- Firmware and DSP (digital signal processing)
Remote Control Apps
Remote control apps are a good place to begin, because they’ve actually been around for quite a while, mostly as UPnP (Universal Plug and Play)-compliant websites and web pages that were first used with early digital music playback devices from companies like Bryston and Magnum Dynalab. Notable current examples include the Aurender Conductor app for Aurender music servers, the Simaudio MOON MiND controller, Audirvana, Amarra, and Roon Remote, to name just a few.
All of these apps serve the same basic purpose, which is to give you a “rich” visual control interface for choosing and playing your music from a laptop, smartphone, tablet or dedicated hardware device. The music can be streamed from an internet service like TIDAL, Qobuz, Apple Music, Amazon Music, Spotify, internet-based radio stations and so on, or from a music server or NAS (Network Attached Storage) device. Numerous audio companies offer music servers, including Bluesound, Sony, NAD, Meridian, Linn, Innuos, Wolf Audio Systems, Technics and many others. NAS drives are available from a number of audio companies and other manufacturers. You could also just use a hard drive on your computer along with Plug and Play-compatible software – Twonky is one example – and a remote-control app.
Note that while music services such as Apple Music and others all have their own interfaces, dedicated apps like Aurender Conductor are often required to operate an audio company’s particular music playback component. These apps can also have other advantages, such as making all of your various music sources, such as multiple hard drives and so on, look like a single source in the app – more about that in a bit – which is a very convenient way to access a music collection scattered across different devices.
Confusing? A little! Put together properly, though, you get to choose all the music in the world (almost literally) or albums and tracks from your own collection (in the form of digital files), just by tapping on a screen that might look something like this:
This illustration shows a Roon navigation screen (from the Roon Labs website).
Roon is a distributed music server system in which multiple “target” devices (DACs, basically, nowadays) and several remote apps are “seen” as a single entity on a single Roon interface.
The Aurender Conductor app and the Simaudio MiND app, as other examples, have much the same functionality but are designed to regard the target device itself (in these examples, an Aurender device or a Simaudio product) as the “brains” of the operation and the actual thing that delivers the sound. Alternatively, Roon, Meridian’s Sooloos and other systems use a separate server computer as the “brains” to route and process the data files (aka “music”) to the “target,” a DAC of some kind behind a network bridge, or a “streaming DAC” or other connected device, like a Sonos system or Apple TV.
So – the remote apps all, at a minimum, let you control the “brains” (the “engine”) of the system, regardless of whether the “brains” of the system reside in the playback device itself or in the remote app, computer or in all of these.
Next, where does the actual signal processing of the digital files occur? It can take place either in a dedicated DAC (digital-to-analog converter), or in a distributed music server system with built-in DSP (digital signal processing). Some systems are proprietary (the Meridian Sooloos as one example) and some are more universal (like Roon, or sonicOrbiter by Small Green Computer).
Distributed Server Systems
Distributed server systems are those in which the server is a dedicated computer that among other things acts as a “traffic cop” for the system.
In the following illustration, the “brains” can be in the DAC, in the remote app, reside on an independent computer, or be present on all three. It depends on the software paradigm being used.
Technically speaking, all processing and manipulation of a digital data stream is a form of DSP, because it really just means that you are doing things with sequences of numbers. However, DSP means something specific in most audiophile-oriented literature. It refers to the manipulation of said numbers to necessary or even “better” effect (meaning better sound quality), and includes everything from digital volume adjustment (yes, that can involve DSP), to sample rate conversion, to manipulation of the frequency response for room-correction equalization.
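Digital volume adjustment is about the simplest DSP there is, and it shows why even a “small” change means doing arithmetic on the samples. The sketch below assumes 16-bit samples held as ordinary integers; the function name is invented for this example, and real players typically work at higher internal precision and add dither.

```python
def apply_gain_db(samples, gain_db):
    """Scale 16-bit PCM samples by a gain in decibels, clipping at full scale."""
    factor = 10 ** (gain_db / 20)          # convert decibels to a linear multiplier
    scaled = []
    for s in samples:
        value = round(s * factor)           # rounding discards a little information
        value = max(-32768, min(32767, value))  # keep within the 16-bit range
        scaled.append(value)
    return scaled


quiet = apply_gain_db([20000, -20000, 12345], -20.0)  # turn the volume down by 20 dB
print(quiet)
```

Because each result is rounded back to a whole number, the output is no longer bit-for-bit identical to the input, which is exactly what people mean when they say a processed stream is not “bit-perfect.”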
If you think of that “engine” as the place that does 90 percent of the audio-related processing (as opposed to non-audio-related functions like controlling the music selection), with the rest being done by the DAC itself, then it makes sense to put that engine onto a separate computer. This allows for much more powerful processing capabilities, the particulars of which will be covered later.
Some people feel that “digital signal processing” – any extra audio “enhancement” or deviation from the original “bit-perfect” data stream – is a bad thing. However, keep in mind that all digital data streams require some manipulation to be heard as music, so where does one draw the line?
All that aside, there’s a major practical advantage in having the server app run on a separate computer: it allows you to have multiple audio streams go to different devices simultaneously. For example, this lets different family members listen to different music at the same time.
There’s a lot of effort nowadays to sell dedicated music-server-specific computers, like the Roon Nucleus, Small Green Computer i5, Antipodes CX, and others, but they’re all really just computers and they all run more or less the same kind of software. The advantage is pre-packaged convenience, which, admittedly, can be a major benefit.
One thing to consider in creating a digital music system: you’ll have to decide if you want to use a computer as a multi-purpose device to share running the audio server app in a distributed system along with other household tasks (like family members surfing the net or playing games), or if you want a dedicated server machine. The advantages of a dedicated computer/music server, of course, are more storage space and access to more and faster processing power without competing with other tasks.
You can configure a computer-based music system to be operated using anything from a smartphone or tablet-based device to a good old keyboard, mouse and monitor. I just use an old, mostly-dedicated 13-inch MacBook Pro and it’s fine.
Firmware and How it Relates to DSP
As we noted in the Glossary section, firmware is simply software that is written for a task, and resides on a specific device or chip in non-volatile memory. Examples of firmware include the software embedded in DAC chips that convert the digital data stream into analog information, aka music.
The illustration is a functional block diagram of an ESS ES9016 DAC chip. Where is the firmware? It’s everywhere. Almost everything this whole chip does – not just the actual digital-to-analog conversion – is controlled by software embedded within (stored on) the chip itself. How does this relate to the quality of the music that you’ll hear? Consider: even at this level, with just one chip, the signal processing is at least somewhat distributed, and it is probably not strictly bit-perfect once you take jitter (timing errors in the signal) into account. The zeroes and ones that comprise the digital audio signal must be timed correctly, which requires a temporal point of reference – a timecode or clock signal – and a very accurate oscillator or word clock. Without such firmware, you have a piece of rock, not a music-playback device. (Outboard master clocks are available for audiophile and recording studio applications.)
What it boils down to is that all of digital audio is essentially digital signal processing. However, let’s assume for the sake of argument that when most people talk about “DSP” with respect to audio, it refers to deliberately altering the audio signal to suit a technical purpose or accommodate someone’s listening preference. Examples of the former would be digital RIAA correction of a phono input, or digital room correction EQ.
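As a concrete example of DSP serving a technical purpose, the RIAA playback curve can be written down as a formula and evaluated in software. The sketch below computes only the gain of the standard curve at a given frequency, using the three standard RIAA time constants (3180, 318 and 75 microseconds) and referencing the result to 1 kHz. It is not a working phono equalizer, and it is not how any particular app implements the correction; the function names are made up for this example.

```python
import math

# Standard RIAA time constants, in seconds
T1 = 3180e-6   # bass turnover (about 50 Hz)
T2 = 318e-6    # about 500 Hz
T3 = 75e-6     # treble roll-off (about 2122 Hz)


def _magnitude(freq_hz):
    """Unnormalized gain of the RIAA playback (de-emphasis) curve."""
    w = 2 * math.pi * freq_hz
    numerator = math.sqrt(1 + (w * T2) ** 2)
    denominator = math.sqrt(1 + (w * T1) ** 2) * math.sqrt(1 + (w * T3) ** 2)
    return numerator / denominator


def riaa_playback_db(freq_hz):
    """Playback gain in decibels relative to the gain at 1 kHz."""
    return 20 * math.log10(_magnitude(freq_hz) / _magnitude(1000.0))


for f in (20, 100, 1000, 10000, 20000):
    print(f"{f:>6} Hz: {riaa_playback_db(f):+.2f} dB")
```

The curve boosts the bass by roughly 19 dB at 20 Hz and cuts the treble by roughly 20 dB at 20 kHz, undoing the opposite equalization applied when the record was cut.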
To leave you with a final thought for now: computer-based music listening systems can accommodate analog. I once tried ripping an album (converting the vinyl to a digital file) using a very low-noise FET-based mic preamp and a Goldring cartridge. The RIAA correction was all software-based using an app called Pure Vinyl, with no need to employ an outboard phono stage. Either because of less phase distortion, a more accurate RIAA EQ, less signal loss or all of the above, the result was amazing.
Header image: Audirvana app, from the Audirvana website.