Quibbles and Bits
Sounds Good to Me
What attributes should an item of equipment in a sound reproduction chain possess in order to meet the objectives of high-end audio playback? One attribute we tend to think of as important is flatness of frequency response. If a piece of equipment either boosts or attenuates a particular band of frequencies, we tend to consider that a big no-no. Departures from a nominally flat frequency response – sometimes even subtle departures – can often be correlated with some sort of perceived tonal coloration in the resultant sound output. So flat frequency response is high on our list of good things, especially with equipment such as amplifiers.
But with loudspeakers the situation becomes horribly convoluted. Loudspeakers don’t even have a simple frequency response. For a start, their output is highly directional. More comes out in the straight-ahead direction than comes out at some angle to the side, or at some angle upwards or downwards. The laws of physics tend to demand that the proportion of the drive unit’s total output which is delivered off-center is strongly frequency dependent. So if, as you might be inclined to do, you measure a speaker’s frequency response in the straight-ahead direction, and optimize all the design variables so that the resultant on-axis frequency response is ruler flat, then the total energy delivered into the room (the sum of the loudspeaker’s outputs in all directions) cannot be flat at all. And vice-versa.
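If you want to see that trade-off in rough numbers, here is a minimal Python sketch using the old textbook model of a rigid piston in an infinite baffle. The cone radius and the frequencies are illustrative assumptions of mine, not measurements of any real drive unit; the point is simply that if the on-axis output is held flat, the total power radiated into the front hemisphere must fall as the piston starts to beam.

```python
# A minimal sketch of the on-axis vs. total-energy trade-off, using the
# textbook rigid-piston-in-a-baffle model. The cone radius and frequencies
# below are illustrative assumptions, not measurements of a real speaker.
import numpy as np
from scipy.special import j1  # Bessel function of the first kind, order 1

c = 343.0   # speed of sound in air, m/s
a = 0.08    # assumed effective cone radius, m (a smallish woofer)

def off_axis_pressure(f, theta):
    """Pressure at angle theta relative to the (flat) on-axis pressure."""
    ka = 2 * np.pi * f / c * a
    x = ka * np.sin(theta)
    safe = np.where(np.abs(x) < 1e-9, 1.0, x)   # avoid 0/0 exactly on-axis
    return np.where(np.abs(x) < 1e-9, 1.0, 2 * j1(safe) / safe)

theta = np.linspace(0.0, np.pi / 2, 2000)        # front hemisphere
dtheta = theta[1] - theta[0]

for f in (200, 1000, 5000, 10000):
    d = off_axis_pressure(f, theta)
    # Total power radiated into the hemisphere, relative to on-axis pressure.
    power = np.sum(d**2 * np.sin(theta)) * dtheta
    print(f"{f:>6} Hz: relative radiated power {10 * np.log10(power):6.1f} dB")
```

With the on-axis pressure pinned at unity by construction, the printed relative power drops by many dB as the frequency climbs – which is all the trade-off amounts to.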
Then there’s the loudspeaker’s interaction with the room. Only a tiny, minuscule proportion of the total sound output of the loudspeaker travels directly from the loudspeaker to your ear. All the rest of it is launched into the room, where it bounces around off the walls and furniture. Some of it doesn’t so much bounce mirror-like, as diffuse like a spotlight shone onto a matte-painted wall. In any case, eventually, some of that dispersed sound also reaches your ears. When all these sounds reach your ears, having taken a plethora of paths to get there, they will recombine. Each of these sounds, depending on the path it has taken, will have a different phase delay, which means that they can recombine constructively or destructively, and the net effect is a somewhat unpredictable and chaotic disturbance to the overall frequency response. And should you move six inches to one side or the other, the net effect can change dramatically.
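For the curious, here is a tiny Python sketch of that effect, with made-up path lengths and reflection strength: a single bounce arriving a few milliseconds behind the direct sound carves peaks and dips into the combined response, and moving the listening position by roughly six inches shifts those peaks and dips to different frequencies.

```python
# A minimal sketch, with made-up path lengths, of how a single room
# reflection combines with the direct sound, and how the result shifts
# when the listener moves a few inches.
import numpy as np

c = 343.0                               # speed of sound, m/s
f = np.linspace(20, 20000, 20000)       # audio band, ~1 Hz resolution

def combined_response(direct_m, reflected_m, reflection_gain=0.6):
    """Magnitude (dB) of direct sound plus one delayed, attenuated bounce."""
    extra_delay = (reflected_m - direct_m) / c
    h = 1.0 + reflection_gain * np.exp(-2j * np.pi * f * extra_delay)
    return 20 * np.log10(np.abs(h))

at_seat = combined_response(direct_m=3.00, reflected_m=4.20)
moved   = combined_response(direct_m=3.15, reflected_m=4.05)  # ~6 in. sideways

for freq in (100, 200, 500, 1000, 2000):
    i = np.argmin(np.abs(f - freq))
    print(f"{freq:>5} Hz: at the seat {at_seat[i]:+6.1f} dB, "
          f"after moving {moved[i]:+6.1f} dB")
```

A real room has hundreds of such paths, each with its own delay and strength, which is why the combined result looks chaotic rather than like a tidy comb.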
To a certain extent, you can attempt to correct for these effects by applying room-correction techniques to the signal. Effectively, you pre-distort the signal in such a way that the speaker/room combination exactly compensates for this added distortion and cancels it out completely. This is a complicated topic of its own, but the short answer, if you want one, is that it works really well at low frequencies and really badly at high frequencies.
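Here is one way to see why, sketched in Python with entirely invented numbers rather than any real product’s algorithm: a naive inversion is exact at the measurement microphone, but the response at your ear drifts away from that measurement as the wavelengths shrink, so the bass correction holds while the treble correction largely misses its target.

```python
# A minimal sketch, with entirely made-up numbers, of why inverse-filter
# room correction works well in the bass and badly in the treble: the
# correction is exact only at the measurement microphone, and the response
# at your actual ear diverges from that measurement as frequency rises.
import numpy as np

rng = np.random.default_rng(1)
f = np.logspace(np.log10(20), np.log10(20000), 400)   # 20 Hz .. 20 kHz

# Pretend response measured at the microphone: a 6 dB room bump at ~60 Hz
# plus a couple of dB of ripple across the band.
measured_db = 6 * np.exp(-((f - 60) / 40) ** 2) + rng.normal(0, 1.5, f.size)

# Response a few inches away, at the ear: essentially identical in the bass,
# but the fine ripple decorrelates as the wavelengths shrink toward head-sized.
decorrelation = np.clip(f / 2000, 0, 1)               # small in the bass, 1 above 2 kHz
at_ear_db = measured_db + decorrelation * rng.normal(0, 3.0, f.size)

correction_db = -measured_db                          # naive exact inversion
residual_db = at_ear_db + correction_db               # what is actually heard

low  = residual_db[f < 200]
high = residual_db[f > 2000]
print("rms error remaining below 200 Hz : %.1f dB" % np.sqrt(np.mean(low ** 2)))
print("rms error remaining above 2 kHz  : %.1f dB" % np.sqrt(np.mean(high ** 2)))
```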
Other effects, such as the ‘liveliness’ of a room – in effect the propensity of sound to continue reverberating around a room long after the generating stimulus has passed – add further complexities to the picture which cannot be reduced to a simple disturbance of the frequency response.
Another related issue boils down to the question of what a listener actually hears. What we actually hear is governed quite strongly by the peculiar shapes of our ears. If we hold the empty tube from a used-up kitchen roll to our ears, and listen through it, we hear an obviously colored sound. We would clearly be upset if our loudspeakers sounded like that. And yet that is precisely what our weirdly-shaped ears do. They effectively color the sound in quite a strong manner. Moreover, this coloration changes strongly according to the direction the sound is coming from. In other words, if you turn your head slightly, the coloration (i.e. frequency response) of what you hear will change slightly. Finally, because your ears have a different shape from mine, what you hear in terms of frequency response will be totally different to what I hear.
Now, to be fair, all of this ear stuff is compensated for to a large degree by our brains, which process what our ears detect, and decide what it is we are actually hearing. And there is a good argument which holds that – over the short term at least – our ears are unchanging, and that our brains are able to compensate for whatever colorations they might impose upon incoming sounds. But even so, in tests I have conducted upon myself – applying very small frequency response aberrations to a real music signal played through my loudspeakers – it is quite surprising how large a frequency response error has to be before it becomes unambiguously noticeable. [NB: You can’t perform this test rigorously just by going into iTunes and playing with the graphical equalizer. It is quite a complex problem, in reality.]
So, an interesting question you might want to ask is this. Why, if the frequency response of a loudspeaker in a room is not remotely close to being flat, should it matter a jot that the frequency response of, say, the amplifier feeding it is not itself ruler flat? Here we get into difficult territory, because there is plenty of anecdotal evidence which supports the received wisdom that the flatter an amplifier’s frequency response, the better it tends to sound.
Before you start throwing things at me, I’m not trying to suggest that any amplifier with a flat frequency response will sound better than any other with a less-flat response. Painting with a broad brush here, the frequency response of just about any non-flat amplifier can be flattened up nicely with, for example, a hefty dose of negative feedback, and as a rule we find that the application of such feedback is normally deleterious to the perceived sound quality.
But it is an interesting observation nonetheless. And the equivalent situation holds for phase response: the phase response of a loudspeaker in a room is also a total mess, yet amplifiers and other electronic circuitry with only minor amplitude or phase response problems are often found to sound notably less satisfactory in real-world systems.
I would add to that digital processing. Low-pass Butterworth filters often sound fractionally better than equivalent Chebyshev filters, and both of these IIR filters tend to sound better than broadly equivalent FIR filters. There I go with the broad brush again, but when you wash away the noise, that is the sort of consistent picture that tends to emerge. I will add this, though. At BitPerfect we have a digital processing engine that allows us to separate out the effects of phase and amplitude response, and the picture that emerges quite clearly is that phase response is by far the more important of the two. In an experiment, we prepared filters which produce a mathematically identical frequency response, but with three different phase responses, one of which is totally flat. Most listeners express a clear preference for the totally flat (i.e. totally linear) phase response.
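The underlying idea of separating the two is straightforward to demonstrate. The Python sketch below is not our actual processing engine, and the sample rate and cutoff frequency are arbitrary illustrative choices, but it builds two low-pass filters with numerically identical magnitude responses – one carrying the Butterworth’s own frequency-dependent phase, the other a perfectly linear phase – and compares their group delay.

```python
# A minimal sketch of separating amplitude from phase response. The sample
# rate and cutoff are arbitrary choices; this is not any product's engine.
# Two low-pass filters share (numerically) the same magnitude response:
# one keeps the Butterworth's own phase, the other has perfectly linear phase.
import numpy as np
from scipy import signal

fs = 48000.0
b, a = signal.butter(4, 2000, btype="low", fs=fs)     # 4th-order IIR low-pass

# Sample the Butterworth magnitude around the whole unit circle...
N = 4096
_, h = signal.freqz(b, a, worN=N, whole=True)
mag = np.abs(h)

# ...then build a linear-phase FIR from it: the inverse FFT of a real, even
# magnitude is a zero-phase impulse response, and delaying it by N/2 samples
# makes it causal, with exactly linear phase and the same magnitude response.
h_linear = np.roll(np.real(np.fft.ifft(mag)), N // 2)

w_iir, gd_iir = signal.group_delay((b, a), w=2048, fs=fs)
_, gd_fir = signal.group_delay((h_linear, [1.0]), w=2048, fs=fs)

for freq in (200, 1000, 1800):                        # points inside the passband
    i = np.argmin(np.abs(w_iir - freq))
    print(f"{freq:>5} Hz: IIR delay {gd_iir[i]:7.1f} samples, "
          f"linear-phase FIR delay {gd_fir[i]:7.1f} samples")
```

The amplitude response is the same in both cases; only the phase behavior (constant delay versus frequency-dependent delay) differs, which is exactly the variable our listening comparisons isolate.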
So there is a definite dichotomy which is still in play. When we listen to loudspeakers in a room, we are hearing a sound which, with the best will in the world, is riddled with serious amplitude and phase distortions across the entirety of the audio band. These massively swamp any corresponding distortions that may have been introduced in the amplification chain. I would go so far as to suggest (not having tried it!) that it would be nigh on impossible even to measure the frequency and/or phase response of an amplifier by using microphones to capture its in-room output via the loudspeaker.
Why, then, are those properties of those amplifiers (and, from my personal perspective, digital audio processing chains) so gosh-darned audible?
Why is the sound of live and recorded music so different that Steve Gutenberg said he could tell the difference, out on the street, from sound coming through an open window three blocks away? Because the properties of the sound fields they produce are very different. To close the gap you have to study and understand sound fields and what you can and cannot hear. Until you do, you are never going to get anywhere. So these relatively trivial differences between component A and component B of the same type could hardly matter less in the scope of the larger problem. It is also useless to chase ghosts, like asserting that you need to be able to reproduce sound above 20 kHz with a dynamic range of 144 dB. It’s a wasted effort, because you can’t hear it.
I sometimes try to figure out why live music sounds so different from well recorded music. My best guess is dynamic range.
Live music sounds much “bigger” than reproduced (which may be why I prefer large line-source planars to point-source dynamics). And live music is experienced much more as a “full body” stimulus, while reproduced music is an “above the waist”, purely auditory one, heard through the ears alone. Reproduced music sounds emasculated, anemic. And I’ve heard music through big SoundLabs, Wilson WAMMs, Magneplanar Tympani T-IVa (which I own) and MG30.7s, Klipschorns, Altec Lansing Voice of the Theaters, etc.
But it all starts with the recording. The superiority of a direct-to-disc LP over any tape recording is staggering! The d-t-d sounds so alive, with such immediacy and in-the-room presence, that tape recordings sound lifeless, dead, eviscerated in comparison. But d-t-d is simply impractical for most music (not to mention musicians and singers 😉).
The main difference between live and recorded music is in the reflections. Most of what you hear live is due to reflections. These define the acoustics of the space. Sound reproducing systems create entirely different reflections than live instruments would, even in the same room. When you talk about a concert venue, which can be hundreds of times larger than your room, the difference is enormously magnified. Understanding this fact, and figuring out how to mathematically model, measure, and construct these reflections (which, for complex reasons, are not on any recording you can buy), has been the primary goal of my efforts for the last 45 years. The sound systems I build for myself are designed to entirely different principles and goals than audiophile-type sound systems. They are much more complex and difficult to adjust, but surprisingly also much less expensive.
I’m new to the world of mid-level (not quite high-end yet) sound reproduction, but I always thought that my ideal system, the one I should gradually work toward, is the one that sounds the closest to live music. I understand all the complexities (I mean, I understand that they exist) but I’m wondering if the people that build sound reproduction equipment use the same evaluation criteria: would they, for example, put a guitar player next to all the gear they can come up with in a good-sounding room and then test and tweak until they are satisfied that the two sounds are indistinguishable to a blind listener? I’m not talking about reproducing the sound of an orchestra in a big auditorium, but just one instrument in a normal-size room. Is it naive to ask why sound technology doesn’t seem capable of doing this? Or perhaps it is capable, but the means to do it are not available to ordinary people?
Thank you.
This is the 64 billion dollar question.
There are a multitude of opinions as to what’s required to realistically reproduce sound, and very little consensus. You’ll see those very disagreements right here on Copper.
After nearly a century and a half of reproduced sound, we still don’t have methods that are convincing, much less with all types of music, in all manner of rooms.
Thanks for reading and writing, and sadly, I don’t have a good answer for you.
Sounds quite straightforward, I agree. But it founders on two assumptions. First, it kind of assumes that the business of recording one instrument in a room, with the highest possible fidelity, is relatively trivial. And of course it isn’t. I might observe, though, that PS Audio has moved Gus Skinas and the SACD Mastering Center into a facility in the PS Audio factory, and that it has a recording studio as part of it. And Dave Wilson, of Wilson Audio Specialties, was also an accomplished recording engineer, and that expertise and experience positively informed his own product development efforts. The second assumption is that equipment optimized to reproduce a guy with a guitar will be equally adept at reproducing other types of music, and that just isn’t the case. Although you’d think it would be if everything was doing its job perfectly.
But your suggestion remains an ideal that I think nobody in the industry would turn down if the means and opportunity were available to them.
What he said.