Nyquist
Swedish-born engineer Harry Nyquist figured out something interesting. If you want to capture sound using digital audio conversion, you need to sample at twice the highest frequency you hope to preserve.
Thus, if your goal is to capture without loss frequencies as high as 20kHz, you need to build a system that samples at twice that frequency—40kHz. Add to that requirement the fact that such a system gets wigged out if you feed it frequencies higher than half the sample rate, and you're required to apply a steep filter before the conversion from analog to digital.
Which is how we wound up with CD’s sampling rate of 44.1kHz. We need the 40kHz part to keep Harry Nyquist happy, and the extra 4.1kHz to keep the engineers tasked with building a brick wall filter from jumping out of windows.
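To make the aliasing problem concrete, here's a minimal sketch (Python with NumPy, my assumed tooling, not anything from the post) of what happens when you feed a converter a frequency above half the sample rate:

```python
# Sample a 30 kHz tone at 44.1 kHz and see where the energy lands.
# Anything above the 22.05 kHz Nyquist limit folds back into the
# audible band instead of disappearing -- hence the pre-conversion filter.
import numpy as np

fs = 44_100                     # CD sample rate (Hz)
f_tone = 30_000                 # deliberately above fs/2
n = 4096
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f_tone * t)

spectrum = np.abs(np.fft.rfft(x * np.hanning(n)))
freqs = np.fft.rfftfreq(n, d=1 / fs)
print(f"peak near {freqs[np.argmax(spectrum)]:.0f} Hz")  # ~14100 Hz = 44100 - 30000
```

The 30kHz tone doesn't vanish; it reappears as a spurious 14.1kHz tone inside the audible band, which is exactly why the steep filter has to come before the converter.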
But here’s the thing. If Nyquist was correct (and he was) that we can perfectly capture any frequency up to half our sample rate, why do we need higher sample rates?
After all, we can’t hear anything above 20kHz (and most of us can’t hear that high).
The answer lies not with Mr. Nyquist, but instead with the challenge of steep low pass filters. As my friend Robb Hendrickson puts it: “Whether you’re recording at 44.1 or 48kHz, you are LITERALLY applying a low-pass filter to NATURE!!!”
Indeed, it’s not the lack of 20kHz (because there is no lack of it), but the effects of filtering we hear.
If we need to apply a low pass filter to nature, it had best be really, really high. Like 100kHz to 200kHz.
Harry gave us only half an answer. The other half is figured out by our ears.
What interests me is the issue of oversampling at the D/A stage when we listen. The following is just my current understanding of the topic, and I may need to be corrected on some points.
As far as I am aware, we can take a CD or track at 44.1kHz and our playback equipment will ‘oversample’ it many times over, so that the effective sampling frequency is many times higher. This does not add any more information (or “detail”) to the sound, but it does allow a much less steep filter to be applied when the D/A conversion takes place. The result is a much more natural sound than if a steep “brick wall” filter were used along with a lower sampling frequency of 44.1 or 48kHz.
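That understanding matches a quick experiment. Here's a rough sketch (Python with NumPy/SciPy, my assumption for tooling) of 4x oversampling; the interpolated stream carries no content the original didn't have, but the first spectral image moves far above the audio band, so the analog filter after the DAC can be gentle:

```python
# 4x polyphase upsampling of 44.1 kHz material to an effective 176.4 kHz.
# The new samples are interpolated, not new information; they push the
# first spectral image out near 154 kHz (176.4 - 22.05) so the analog
# reconstruction filter can roll off slowly instead of brick-walling.
import numpy as np
from scipy.signal import resample_poly

fs = 44_100
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 18_000 * t)

y = resample_poly(x, up=4, down=1)   # now effectively 176.4 kHz
print(len(x), "->", len(y))          # 4x the samples, same audio content
```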
So why are the hi-res tracks that we listen to delivered to us at 96 or 192kHz? Why not just upsample tracks from 44.1 or 48kHz, if the 96 or 192kHz tracks provide no more audible information than the 44.1 tracks?
I know that some listeners can hear the difference between tracks sampled at 44.1kHz and 96 kHz or higher. Surely there must be some other benefit to higher sampling rates than just the ability to use more gentle filtering?
I should clarify that I am referring only to the sampling rate of the tracks we actually listen to on our systems at home – from CD or other storage media. I am aware that recording engineers use higher resolution formats when they capture and mix the music for a number of different reasons. I should also point out that I am referring only to PCM sampling rates, and not DSD.
Isn’t the potential improvement in sound quality and sampling accuracy greater from the step from 16 bits to 24 or even 32 bits than from just doubling the sampling frequency? 🙂 Not to mention the headroom gained for digital mixing and mastering – or digital volume control.
Part of the problem is answering the question “what is audibility?”
If you use the old example of drawing a curve on a piece of graph paper, then fully coloring in each square the curve passes through, the colored squares still represent the curve, but are the inaccuracies audible?
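The graph-paper picture can be put to numbers. A small illustration (Python/NumPy assumed; the 997 Hz tone is just an arbitrary test signal) of rounding a waveform onto a 16-bit grid and measuring how big the "coloring error" is:

```python
# Quantize a full-scale sine to a 16-bit grid and measure the error.
# The result lands near the textbook figure of about 6 dB per bit.
import numpy as np

bits = 16
t = np.arange(96_000) / 48_000                  # 2 seconds at 48 kHz
x = np.sin(2 * np.pi * 997 * t)                 # full-scale test tone
step = 2.0 / (2 ** bits)                        # height of one grid square
xq = np.round(x / step) * step                  # "color in" the squares
err = xq - x
snr_db = 10 * np.log10(np.mean(x**2) / np.mean(err**2))
print(f"SNR ~ {snr_db:.1f} dB")                 # close to 6.02*16 + 1.76 = 98
```

Whether that residual error is audible at realistic listening levels is the open question the analogy raises.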
Then there is the question of what is audible vs. what is perceivable.
My favorite example of this was a few years ago, Wilson Audio gave a great demonstration featuring their Thor’s Hammer subwoofers. They played an ordinary music track that was recorded live, and that had no sub-20 Hz musical content.
First they played it with the subwoofers on, then they switched them off.
When switched off, your sense of the space of the room the music was recorded in virtually collapsed. When the subs were turned back on, the sense of space returned.
This is all to say that while we hear from 20 Hz to 20 kHz, we may perceive beyond that range to generate our sense of space.
There are still many reasons why you can walk into a room with a $1M+ audio system and know instantly whether you are hearing a recorded piano or a real one.
One of the exhibitors at RMAF used to show this quite simply, by having a piano in their room and giving occasional performances by a pianist.
People in the adjacent room with the audio system would oooh and aaah over the way the system sounded, but from the moment the pianist hit the first note in the room with the piano, it was obvious that the reproduced audio wasn’t even CLOSE to what you experienced hearing the real thing.
To kind of bring this back to digital audio, the next time you are near a real piano, try to spend some time observing and noting in detail what the hammer striking the strings – the initial impact of the note – sounds like, even when the piano key is struck very forcefully.
Compare it to the best digital audio playback you can find on the best DAC you can hear.
To date I’ve only heard a very small handful of DACs that get that impact right rather than making it sound hard and metallic, and that includes most of the most highly rated units in the recommended components listings of both Stereophile and TAS.
I personally have a piano track I use, and I can tell within a few seconds at most whether a DAC is getting it right; the majority of the time it’s not even close, but so many people are so used to the way most DACs get it wrong that they no longer notice that most still do.
It’s not unlike the way people who have only lived with conventional coned speakers get so used to hearing the enclosure they no longer notice it is present… until they hear a speaker that doesn’t do that.
But if you’ve lived with electrostats, ribbons, planars or open baffle speakers (like those from PureAudioProject) for a few years, you can INSTANTLY hear the enclosures of 98% of the speakers playing at an audio show from note one, regardless of their price point.
Dan Brown of DBX fame did research into the “audibility” question and determined through experimentation that it is necessary to reproduce source frequencies as high as 70 kHz in order to retain the natural spatial imaging in the program. Even though the ear is incapable of registering those ultrasonic frequencies on their own, the way the ultrasonic components combine with the audible fundamental frequencies results in artifacts that the brain uses to determine spatial cues and timbre.
Though I’ve not experimented with this myself, I find it fascinating and someday hope to explore it further.
This “standard” has always baffled me. What if the samples are taken at the two zero crossings?
I’m sure that standard was devised for voice transmission, which is what Nyquist was working on when he formulated the theorem.
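The zero-crossing worry is easy to poke at numerically. A minimal sketch (Python/NumPy, my assumption) sampling a tone at exactly twice its frequency shows why the theorem demands sampling at more than twice the highest frequency, not exactly twice:

```python
# Sample a 20 kHz tone at exactly 40 kHz (fs/2 on the nose).
# Depending on phase, the samples can land on the zero crossings
# and the tone disappears entirely; at other phases it survives.
import numpy as np

fs = 40_000
f = 20_000                      # exactly fs/2
n = np.arange(16)
for phase in (0.0, np.pi / 2):
    x = np.sin(2 * np.pi * f * n / fs + phase)
    print(f"phase {phase:.2f}: largest sample = {np.max(np.abs(x)):.3f}")
# phase 0.00 -> 0.000 (all zero crossings), phase 1.57 -> 1.000
```

In practice the anti-alias filter keeps real content safely below fs/2, so this edge case never quite occurs, but it is why the theorem is stated as a strict inequality.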
This is where I kinda lose interest…in the ‘nitty gritty’ technical side.
I use my ears to match audio equipment & set-up loudspeakers in a room…that’s my ‘audio shtick’
Of course I’m glad that HN & guys of his ilk are/were interested in the highly technical side of audio, for without them…
I just see the nitty gritty technical side as a means to an end, namely, to ‘fit’ live music into a can the best way possible.
Me too. I presume someone also invented the mains plug. I’m an audio consumer and I’m only really interested in what’s inside and how it works if it affects my purchasing decision. On the subject of nitty-gritty, the limit of my technical knowledge is the difference between 80 and 320 gritty-gritty.
Reading this post, the first thing that came to mind was the far more interesting Sven Nykvist, the great cinematographer who shot many of Ingmar Bergman’s films. The first film I saw with my wife was The Ox, one he also directed.
Steven,
Haha…’Bodyline’ vs ‘Gritty-Gritty’ 😉
I agree with Fat Rat; the technical end, with the kHz, bits, and dBs, is where I get a little lost. My EARS rule and set the law. Somewhat off today’s topic, and going back just a few posts here: tone controls. I like a crisp high end and punch in the bass, and back in the day, tone controls gave me just that. I really don’t think it had anything to do with whether my speakers were bad speakers that needed controls to make them better; actually they sounded pretty damn great at a flat setting. In fact, without realizing I was building a “sound stage” back then, my speakers were off the walls and on the opposite side of my room at the time, and the sound I got from them was fantastic! I was experimenting with a less expensive way to get great sound delivered to them and decided to use doorbell wire: solid copper wire that I could run all the way around the room, that would fit just under the baseboard and carpet, and that could be shaped easily. Today my setup is different, my rig is “better” and “newer,” and I have no tone controls??? I think my ears got better? Something I’ll think about in my upcoming listening session. My Administrator is hanging with mom Saturday 🙂 OK, so keep listening folks 🙂
Bob Stuart of Meridian knew this way back, when I purchased my first 96kHz/24Bit speakers and a compatible 500 CD Player.
Future MHR frequency doubling was part of the deal.
This was followed by a controller which featured Trifield, the introduction of a third centre speaker.
MQA is the latest innovation from Mr Stuart, and the huge number of improved-SQ releases from Warner and Universal are testament to its efficacy. Listen and enjoy.
Oh and if you get the chance get vaccinated
JMHO
Oversampling alone will not fix all the problems of material which is originally delivered at 16/44.1. At this relatively low sampling rate, an anti-alias filter has to be applied before the conversion to digital, and this filter adds its own imperfections to the encoded data. If the source is originally sampled at much higher rates, the anti-alias filter can be reduced in slope, or perhaps even eliminated entirely for 352.8 PCM. This is also a big advantage for DSD, where no anti-alias filter is required in the A/D process.
For playback, the Nyquist theorem is correct, with the caveat that it assumes “perfect” digital filters exist. If we could actually create perfect digital filters, with no phase shift, infinitely steep slope, and no ringing, then 44.1 at 20 bits or so would likely be all that we need (the extra bits are useful for mixing headroom). Unfortunately, we cannot create uncompromised filters in either the digital or analog domains, and this is where higher resolution audio, and especially DSD, becomes an advantage. To my thinking, it is too bad DSD never really caught on as a delivery format: record everything at 352.8/32, giving the ability to do all the editing and mixing one might need, and then convert with the best re-sampling software (now probably HQPlayer PRO) to DSD 128 for distribution. Any testing I have done using quality sample files has suggested that once we get to 352.8 PCM as the original rate, the differences between conversions, and between PCM and DSD, either disappear (in the digital domain) or become imaginary (although D/A conversion still favors DSD), as long as the files are converted to analog via a good DSD conversion path.
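The filter burden is easy to quantify. Here's a hedged sketch (Python/SciPy assumed) using the standard Kaiser design rule to estimate how long an FIR anti-alias filter must be for a 20kHz passband at two different sample rates:

```python
# Estimate FIR length for a lowpass with ~100 dB stopband attenuation
# passing 20 kHz, at two sample rates, via the Kaiser window rule.
from scipy.signal import kaiserord

atten_db = 100.0
for fs, f_stop in ((44_100, 22_050), (96_000, 48_000)):
    width = (f_stop - 20_000) / (fs / 2)     # transition width / Nyquist
    numtaps, beta = kaiserord(atten_db, width)
    print(f"fs={fs:>6} Hz: {f_stop - 20_000:>5} Hz transition -> ~{numtaps} taps")
# By this rule of thumb the 44.1 kHz brick wall needs several times more
# taps than the 96 kHz filter, i.e. more ringing and phase compromise.
```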
Oversampling adds no information; it just presents the same 44.1kHz samples at a faster rate. This allows the D/A converter to treat them as though they were a much wider bandwidth signal, which in turn allows the reconstruction filter between the highest analog frequency and the sampling frequency to sit at a much higher frequency, where its phase shift is far beyond the audible range.
The sampling frequency is twice the highest frequency in order to capture both the positive and negative parts of each waveform within the time window allotted to the sample. Frequency and time are inversely related. I posted it recently, but here it is again: it’s the Fletcher-Munson curve extended to 20 kHz, where the minimum loudness at which you can hear a tone and the threshold of pain meet at the same point. Note the dotted curves on the right-hand side.
https://en.wikipedia.org/wiki/Equal-loudness_contour
Paul likes Wikipedia, so do I.
The reason MQA can’t work as advertised is that it tries to fold 3x the bandwidth into a 1x time bag by downshifting in frequency twice and putting the folds at different voltage levels. The problem is that they crash into each other in time, violating the Nyquist criterion. The process is called “audio origami” by its inventor; origami is the Japanese art of paper folding. Here’s an explanation of how it is supposed to work: https://www.youtube.com/watch?v=T5o6XHVK2HACO9khE The explanation starts at 14:24.
Since no music has a dynamic range over 96 dB, the native resolution of 16 bits, let alone the 108 dB that is claimed can be achieved by adding dithering, MQA is a solution that doesn’t work for a problem that doesn’t exist.
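For the record, the 96dB figure is simple arithmetic; a back-of-envelope check (plain Python) of quantization dynamic range per bit depth:

```python
# Theoretical SNR of an ideal quantizer for a full-scale sine:
# about 6.02 dB per bit plus 1.76 dB, with 6 dB/bit the common shorthand.
for bits in (16, 20, 24):
    print(f"{bits}-bit: ~{6.02 * bits + 1.76:.0f} dB (sine), "
          f"~{6.02 * bits:.0f} dB as usually quoted")
# 16-bit lands near 96-98 dB; noise-shaped dither can buy additional
# perceived range in the midband, which is where claims like 108 dB live
```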
Whatever the preference for recordings made or remastered in HD, including those that genuinely qualify as true HD technology, that preference is not based on the increased technical capabilities of the recording/playback technology but on other factors unrelated to it. I know some audiophiles will argue about it endlessly, and there is no convincing them otherwise, so this is only intended for people who want some actual understanding of the physics, audiometry, electrical engineering, and mathematics behind what our science actually thinks it knows. If you are getting your information from magazines, manufacturers, and sales reps, all of whom have little or no real technical knowledge, then go out and waste your money. It’s not my money, so I couldn’t care less.
Even amateur musicians made recordings at 24/48, not RBCD, when switching from analog to digital recording! So what is the reason for downsampling to RBCD??? My experience with several generations of DACs tells me that the first consumer DAC designs were so poor they could reveal neither the potential quality of RBCD nor that of higher sampling rates! It took some 25 years to make the improved sound quality of hi-res audible, and then only when the recording and mixing were done professionally!
Apparently the tools for 24/48 make editing easier. I explained the other day that 48 was chosen over 44.1 for tape in 1983 to prevent pirating of CDs in the digital domain, since the 1000th digital generation would be identical to the first; they never anticipated recordable CDs. The 24-bit depth makes it easier to avoid being too careful with levels, inadvertently overloading the signal or dropping into the noise floor. Funny, most music for commercial sale was deliberately overloaded to sound louder on car radios when flipping through stations. Nobody cared about the distortion it deliberately created. Considering what they listen to and what they listen on, it could hardly matter less anyway.
Sir Richard Peto is probably the greatest or most important statistician and epidemiologist of the current times. Like everyone, he occasionally makes mistakes, but Peto is absolutely brilliant; listening to his talks you realize you are in front of a very interesting person. He once was talking about medical studies that failed to show an effect. He said, in words more or less these, that you need patients likely to die if you want your premise of saving lives to show an effect. A population with a 1% chance of mortality in 3 years will have a very difficult time showing a reduction: huge studies, enormous sample sizes, unlikely clinical value.
Brian Feagan, from London, Ontario, was tired of orthopedic surgeons claiming that their surgeries had positive outcomes in osteoarthritic knees. Every orthopedic surgeon claimed that the surgeries they performed improved the pain and quality of life of OA patients. He helped design and participated in a study where patients were randomized to surgery or sham. The patient would undergo anesthesia, and then the surgeon would open an envelope that told him (or her) to operate normally or just make a small cut in the skin. Patients were not told whether they had the sham surgery or the actual one, and follow-up was performed by separate physicians who were unaware of which surgery had been done. [All ethical principles, such as consent, were followed.] The endpoint showed that surgery had no value. Of course, Canadian orthopedic surgeons hated Brian’s results and bitterly complained that he didn’t know what he was doing and that they knew better, that their patients did do better. Nonsense; the evidence was clear.
If you really want to know if filters or sampling rate matters, you can test this yourself. Many DACs let you select the filter type. You don’t need to understand the algorithm of the filter or even how to do the test. Or you can be like the orthopedic surgeons who think they know best. Peto would tell you it is not very difficult, even if you think the difference is minimal.
It baffles me why proponents of the Nyquist theorem seem to imply that sampling the highest frequencies with two samples is adequate for their “reproduction.” It’s my contention that this is the reason the top end doesn’t sound complete with RBCD, in addition to the fact that everything stops at roughly 19-21kHz (demonstrable on spectrum analyzers).
Sure, we don’t exactly hear those stratospheric sounds, but for one reason or another we sense when they are missing in action. This plays out in the airiness and ease of the playback. Mostly, audiophiles like myself point in that direction to illustrate the verisimilitude of LP playback versus CD; that is not to include the higher sampling rate digital recordings. End of rant.
Just take the test to figure out if you can really tell the samples apart. There are tests that you can take online that are double blind and tell you if what you believe is true.
Belief is one thing, testing and measuring your own ability will illuminate you.
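Scoring such a test is straightforward. A small sketch (Python/SciPy assumed; the 13-of-16 tally is just an example) of how unlikely a blind-test result is under pure guessing:

```python
# Probability of getting k or more of n forced-choice trials right
# by luck alone, under the null hypothesis of 50/50 guessing.
from scipy.stats import binomtest

n_trials, n_correct = 16, 13
result = binomtest(n_correct, n_trials, p=0.5, alternative="greater")
print(f"{n_correct}/{n_trials} correct: p = {result.pvalue:.3f}")  # ~0.011
# The usual convention takes p < 0.05 as evidence you are really
# hearing a difference rather than guessing.
```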
CtA,
Your 11:30am post, especially the last paragraph, terrific!
You only have to admit the results to yourself.
(Some people have trouble admitting to others when they find out that their long held beliefs do not hold up)
Nyquist has always given the minimum sample rate; how we got 44.1 had nothing to do with audio quality. Sony developed a technology that could encode digital audio on standard videotape; its limit was 44.1k x 16 bits. Philips developed a video disc that could be replicated in a vinyl plant. The Compact Disc was a marriage of those two existing technologies. Ideally, you want to use whatever sample rate allows the use of a gentle filter. Most of the “experts” consider the absolute minimum that isn’t a serious compromise to be around 60k x 20 bits.
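That videotape origin is why the number looks so odd; the PCM-adaptor arithmetic checks out in one line (plain Python):

```python
# 3 audio samples stored per usable video line, per field:
ntsc = 60 * 245 * 3   # 60 fields/s x 245 usable lines = 44100
pal  = 50 * 294 * 3   # 50 fields/s x 294 usable lines = 44100
print(ntsc, pal)      # the one rate that fits both video systems
```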