Life beyond 20kHz

February 27, 2016
 by Paul McGowan

"At least one member of each instrument family (strings, woodwinds, brass and percussion) produces energy to 40 kHz or above, and the spectra of some instruments extend beyond 100 kHz. Harmonics of muted trumpet extend to 80 kHz; violin and oboe, to above 40 kHz; and a cymbal crash was still strong at 100 kHz."

Studies show we cannot hear beyond 20 kHz, most of us less than that. Yet we recognize higher sample rates sound better - and higher means higher than we can hear - which the measurementists claim is poppycock. But, what if we can hear above 20kHz? Might that explain some of why we like extended bandwidth equipment?

James Boyk, of the Caltech Music Lab (yeah - I thought they only did spaceships too) wrote a fascinating paper entitled "There's Life Above 20 kHz".

Given the existence of musical-instrument energy above 20 kilohertz, it is natural to ask whether the energy matters to human perception or music recording. The common view is that energy above 20 kHz does not matter, but AES preprint 3207 by Oohashi et al. claims that reproduced sound above 26 kHz "induces activation of alpha-EEG (electroencephalogram) rhythms that persist in the absence of high frequency stimulation, and can affect perception of sound quality."

Oohashi and his colleagues recorded gamelan to a bandwidth of 60 kHz, and played back the recording to listeners through a speaker system with an extra tweeter for the range above 26 kHz. This tweeter was driven by its own amplifier, and the 26 kHz electronic crossover before the amplifier used steep filters. The experimenters found that the listeners' EEGs and their subjective ratings of the sound quality were affected by whether this "ultra-tweeter" was on or off, even though the listeners explicitly denied that the reproduced sound was affected by the ultra-tweeter, and also denied, when presented with the ultrasonics alone, that any sound at all was being played.

From the fact that changes in subjects' EEGs "persist in the absence of high frequency stimulation," Oohashi and his colleagues infer that in audio comparisons, a substantial silent period is required between successive samples to avoid the second evaluation's being corrupted by "hangover" of reaction to the first.

Boyk's own conclusion suggests that if this is true, and there seems ample evidence it might be, then hard filtering everything above 20 kHz, as in a CD, might just be the worst thing we can do - and it would explain much about why higher sample rates make sense, even though they capture frequencies we technically can't hear.

It's just one more possible nail in the coffin of the measurementists who steadfastly refuse to recognize what many of us perceive just might be right.


26 comments on “Life beyond 20kHz”

  1. Paul, it smells like it has promise in it, but all my experiments say that such things have nothing to do with getting satisfying playback - rather, higher sampling rates work because the electronics are in better shape while doing their job ... I've done the exercise of taking an mp3 file, relatively heavily compressed, turning it into a WAV file, then upsampling it to higher and higher rates, saving it to file each time - each file got bigger as I went. Then, on playback the mp3 sounded worst, the WAV better, and each increment of hi res size got better, at each step. Specifically, the treble improved, sounding more "natural" as I went ... no more information was recovered by doing this, so why the effect?

    My belief is that the circuits of the DAC and following have an easier time in doing their job when fed the audio data at these higher sample rates - and everything else follows ...

      1. So called "brick wall" filters are not brick wall at all, at least not in the analog domain. They affect frequencies well below their 3 db down corner frequency, usually enough to be audible. Even a single pole 20 khz butterworth filter will do that, 1 db down at 10 khz. The more poles there are to get the most rapid falloff, the more likely they are to be tuned to slightly different frequencies, multiplying the impact they have in the audible range significantly. This is why it was necessary to oversample digital recordings as though they were recorded at much higher sample rates. This allowed the filters to be tuned to frequencies far beyond the audible range and therefore have negligible effect in the audible range.

        In one experiment I heard about third hand, a Japanese professor of electrical engineering created a demonstration to prove to his students they couldn't hear beyond 20 khz. He produced a wideband recording and used a tweeter with extended frequency range. He then switched a filter that cut the signal above 20 khz in and out. The students heard the difference. Then the professor realized his experiment was flawed because the signal above 20 khz caused distortion in the tweeter in the audible range. He revised the demo using a separate tweeter above 20 khz and got the expected results.

        Amplifiers don't cut off at exactly 20 khz either. If they are designed that way they often have rolloff below 20 khz especially as power output increases. This is what so called slew rate distortion or TIM distortion is about.

        Demos and experiments that lead to the conclusion that you can hear beyond 20 khz should be carefully examined for errors and flaws in methodology.

        Note to Acuvox, a full understanding of frequency response includes not just amplitude frequency response but phase frequency response as well. In so called zero phase systems they are directly related to each other so you only need to know one of them, usually amplitude frequency response. In non zero phase systems, which include multiway loudspeaker systems and hearing, they are not related. The electrical phase response that Atkinson publishes for his loudspeaker measurements is only part of the story, the part that the amplifier must deal with. The acoustic phase response is entirely another matter and is far more complex and much worse.
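The single-pole figure quoted in the comment above can be checked with a few lines of arithmetic. What follows is a minimal sketch, assuming an ideal first-order low-pass whose magnitude response is 1/sqrt(1 + (f/fc)^2); the function name and the list of test frequencies are illustrative choices, not anything taken from the thread.

```python
import math

def first_order_lowpass_db(freq_hz: float, corner_hz: float) -> float:
    """Magnitude response, in dB, of an ideal single-pole low-pass filter."""
    ratio = freq_hz / corner_hz
    return -10.0 * math.log10(1.0 + ratio * ratio)

CORNER_HZ = 20_000.0  # the -3 dB corner frequency used in the comment

# Illustrative frequencies; the 10 kHz row is the one the comment refers to.
for f in (1_000.0, 5_000.0, 10_000.0, 20_000.0):
    print(f"{f:>8.0f} Hz: {first_order_lowpass_db(f, CORNER_HZ):6.2f} dB")
```

With a 20 kHz corner this gives roughly -0.97 dB at 10 kHz, in line with the "1 db down at 10 khz" estimate. Whether a droop of that size is audible is a separate question that the arithmetic does not settle.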

  2. The Oohashi paper has garnered some attention in the audiophile community, but in academic circles less so. There is an attitude of 'not buying it' in play. In any case, as many people have rightly pointed out, these results - now approaching their 25th birthday - have yet to be validated elsewhere. There is no evidence that Oohashi himself followed up on it. Note that negative results - i.e. results showing that a certain thing didn't happen - are rarely considered publication worthy. In this case, results refuting Oohashi would almost certainly fall into that category. Maybe even follow-up results obtained by Oohashi himself?....

  3. Your quote: "It’s just one more possible nail in the coffin of the measurementists who steadfastly refuse to recognize what many of us perceive just might be right.". While I understand what you are trying to say, "measuring of EEG's" is still measuring. This is a perfect example of the argument that when listeners claim to hear differences but EXISTING measurements say that those audible differences can't exist, then that may only mean that we have not figured out either how to measure the differences or what to measure. A perfect example in audio is the very audible differences tube traps make in room acoustics. Art Noxon had to "invent" the MATT test to actually be able to measure what he [and everyone else] was hearing.

  4. Let me correct the statement that, "We cannot hear above 20 kHz." It is, in fact, erroneous. While we cannot hear above 20 kHz with our conventional auditory anatomy, we CAN hear well above 20 kHz through our bones. This may actually account for the difference in test equipment measurements and what we actually perceive. Years ago, I was involved in the development of an innovative hearing aid for completely deaf individuals. This device converted sounds in the 100 Hz - 20,000 Hz range to 20,000 Hz - 40,000 Hz range, and delivered the signals to the brain via bone conduction. A transducer was attached behind the subject's ear, pressing against the hard bone located there behind the ear. It was incredible to see people who had been completely deaf for much of their lives suddenly be able to hear and understand sounds, and in certain cases even understand spoken words. This demonstrates the ability to receive and interpret signals above 20,000 Hz -- not through our conventional auditory system, but through the bones in our bodies. This phenomenon may well account for our enjoyment of higher bandwidth audio equipment.

    1. This post totally made me think back to my 70s days when I had a product called a "Bone Fone" which you draped around your neck and it vibrated in such a way as to produce sound in your ears via the vibrations. I had totally forgotten this until now. Unlike egg shaped stereo chairs, they seem not to command much in the vintage market; they are on eBay for 45 bucks. I bet they sound awful; it has been so long that I have no recollection of how they actually worked 🙂

  5. But in science, experimental results need to be repeatable to actually mean something. AFAIK, there have been very few attempts to repeat these results (maybe only one serious one) and the results weren't repeated.

    So this study is a curiosity, and not much more than that at this point.

    One of the problems with all this stuff in audio is that proper blind studies are difficult to do, and cost money. There's not much interest among qualified people to conduct them, and even less money. So very little proper testing in these areas has been done, as surprising as it seems.

  6. Would anyone like an anecdote with their morning coffee? My hearing is not great at high frequencies, but at an audio show about 10 years ago I had the opportunity to "experience" the inclusion and omission of a super-tweeter in a very good sounding system, through two cycles of inclusion and omission. While this was not rigorous science, I was able to recognize that music within my range of hearing was more relaxing and pleasurable, seeming to require less brain power to comprehend the signals in the various instruments in the orchestra. Unfortunately, the super-tweeter was quite expensive. I'm inclined to go along with the bone conduction theory mentioned above.

  7. When we say we can't hear above 20 kHz we are speaking about individual sine waves since that's what is used for the tests. But sound (music) isn't single sine waves. It's an algebraic combination of 'sine waves' into a single complex wave form that is constantly changing. I'll bet that if we look at a picture of the wave form of a sound segment on an oscilloscope, with and without its ultrasonic components, the wave form will be slightly different. The question then becomes whether that segment sounds different played back filtered and unfiltered. That's certainly a different test than the standard sine wave test. I don't know the answer. But I'd like to hear some test results.
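The waveform comparison proposed in the comment above is easy to sketch numerically. The example below is only an illustration under stated assumptions: the 192 kHz sample rate, the 5 kHz and 30 kHz tones, and the 0.2 relative level are all made-up values, and the "filtered" version simply omits the ultrasonic tone, which stands in for an ideal filter.

```python
import math

# All values below are assumptions chosen for illustration.
SAMPLE_RATE_HZ = 192_000   # high enough to represent a 30 kHz component
AUDIBLE_HZ = 5_000.0
ULTRASONIC_HZ = 30_000.0
ULTRASONIC_LEVEL = 0.2     # ultrasonic amplitude relative to the audible tone

def waveform(n_samples: int, include_ultrasonic: bool) -> list:
    """Sum of an audible sine and, optionally, an ultrasonic sine."""
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE_HZ
        value = math.sin(2 * math.pi * AUDIBLE_HZ * t)
        if include_ultrasonic:
            value += ULTRASONIC_LEVEL * math.sin(2 * math.pi * ULTRASONIC_HZ * t)
        samples.append(value)
    return samples

full = waveform(192, include_ultrasonic=True)       # 1 ms of signal
filtered = waveform(192, include_ultrasonic=False)  # idealized "filtered" version

max_difference = max(abs(a - b) for a, b in zip(full, filtered))
print(f"largest sample-by-sample difference: {max_difference:.3f}")
```

The two waveforms do differ on the "oscilloscope", by as much as the ultrasonic amplitude. The sketch says nothing about whether that difference can be heard, which is exactly the open question the comment raises.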

  8. This is getting almost as tiresome as the cable debates. But since this is a place to share knowledge and experience here's my view. I posted this on Dr. Mark Waldrep's web site a few months ago. Dr. Waldrep goes under the moniker of Dr. Aix and occasionally posts here. He's a recording engineer who specializes in recordings that have content over 20 khz, in fact out to 46 khz or more. He correctly points out that as of today there are only a few thousand recordings that contain much content over 20 khz, his being among them. This is because to get this content into a recording the entire process must have sufficient bandwidth and most don't, they fail in at least one place in the process, usually many. Here's what I posted quoting Waldrep's own words from a lecture he gave.

    "What about High resolution audio? Here’s are some authoritative quotes [From Dr Waldrep]; “The honest truth is that at the end of the day if I played a 44.1,16 bit recording in here and I played my 96K 24 bit originals none of you could tell the difference. My friends in the mastering community can’t tell the difference.” “Guitar Noir is one of our best selling albums….notice that the full scale of 96 24 goes up to 46 khz. …there are frequency components that actually exceed 20 khz. But Mark you said earlier we don’t hear that. I don’t care. My definition of fidelity means that if it’s coming out of…..I’d like to capture it ….maybe, just maybe there is something going on in our brains…..” “Folks, we’re deluding ourselves….compact discs, this is our reference folks, this is our standard definition audio. It’s not hi rez, it sounds wonderful and anybody sits there and says oh yea, they can instantly hear the difference between a CD and a hi rez file they’re crackers, it’s not easy to do. We did tests. They played them through headphones, they played my 96K 24….downconverted from 96.4 to 44.1. Well guess what, every recording that they used as a test…”

    I’d listen to this guy. He seems to know what he’s talking about. The difference between RBCD and hi rez is very subtle and there are only a few thousand hi rez recordings available.

    I have no dog in this fight. I’m not in the biz. I’ve got about 3000 vinyls and 3000 cds, in very round numbers. I also have turntables that I like very much to play the vinyls on and lots of CD players. But I rarely listen to vinyl. For one thing it’s a PITA. And there’s no remote control. I also appreciate the lack of pops and clicks especially on soft music. CDs that I bought over 25 years ago sound like brand new. They never seem to wear out. Okay, I’m sold, RBCD is it for me. (it was from the first time I heard one but I had to wait until they got the steely sound out of violins before I bought one.)"

    And Dr. Waldrep's reply;


    Mark, you’ve got way too much time on your hands to do all this research.

  9. I make it a habit not to refer to my credentials as they are usually irrelevant to any discussion and don't add anything to who is right and who is wrong. Others who do, use this strategy to intimidate those they disagree with and to make their point by proving their superior knowledge, a false argument and unfair tactic. But this time I'm going to make an exception. Dr. Waldrep has a PhD in Recording Engineering. He is a very smart and experienced man and has a great deal of knowledge and experience. But I have a BE in Electrical Engineering and there is a world of difference. I did the math. Not once, not twice, but hundreds of times. Starting in my sophomore year once I'd mastered a sufficient amount of calculus and for three years straight I cranked out Fourier Transform Integrals and Inverse Transform integrals. Why? Because to become a competent analog electrical engineer this skill and understanding is absolutely necessary but not sufficient. You can't understand how anything works without it. You can't understand amplifiers, filters, negative feedback, how tubes and transistors work, how radio, television, and other communications systems work, how circuits work. Everything that comes afterwards depends on it. No other engineering discipline, and not even mathematics majors, get this pounded into them in anything like the way electrical engineers do. And it's not easy. The actual calculations are hard. Here's the transform the way I learned it. f(j*omega) = 1/(2*pi) * integral from zero to infinity of f(t) * e^(-j*omega*t) dt. Omega is 2*pi*frequency and e is the base of natural logarithms. There are large tables for Fourier and Laplace transforms of typical functions. Why do engineers use this technique? For one it gives them a different way to see and understand waveforms, phenomena, and processes. For another it is often far easier to solve problems in the frequency domain than in the time domain. To get the output of a circuit, you just multiply the input waveform in the frequency domain by the gain G(j*omega) also in the frequency domain to get the output. Then you use the inverse transform to see the waveform in the time domain. The inverse transform is just as hard.

    Frequency response analysis is used in EVERY branch of science and engineering. One reason is that it explains the principle of resonance, the most commonly observed phenomenon in the universe. It not only ties in with electronic circuits, it ties in with electromagnetic wave phenomena, Newton's laws, in fact everything.

    I've been experimenting with high frequency response within the 20 khz passband of audio systems for about 25 years and learned a lot. Of 64 drivers in my main sound system 40 are tweeters. IMO the top two octaves are extremely important and if you don't get them right, nothing else matters. The top octave is particularly hard. As hard as getting bass right is, this is much more difficult. IMO virtually every loudspeaker on the market I've ever heard or seen got it wrong.
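The frequency-domain recipe described in the comment above (transform the input, multiply by G(j*omega), transform back) can be sketched with a discrete Fourier transform in place of the continuous integral. This is a minimal illustration, not the commenter's own method: the single-pole low-pass used for G, the 64 kHz sample rate, and the 10 kHz test tone are all assumptions, and the naive DFT treats the signal as periodic, so it shows steady-state behaviour only.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a short demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(): back from the frequency domain to the time domain."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def lowpass_gain(freq_hz, corner_hz):
    """G(j*omega) of a single-pole low-pass, written in terms of frequency in Hz."""
    return 1.0 / (1.0 + 1j * freq_hz / corner_hz)

# Assumed demo values: bins are 1 kHz apart, so the tone sits exactly on bin 10.
SAMPLE_RATE_HZ = 64_000
N = 64
CORNER_HZ = 20_000.0
TONE_HZ = 10_000.0

x = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE_HZ) for t in range(N)]

X = dft(x)                       # step 1: input into the frequency domain
Y = []
for k in range(N):
    # Bins above N/2 represent negative frequencies.
    freq_hz = (k if k <= N // 2 else k - N) * SAMPLE_RATE_HZ / N
    Y.append(X[k] * lowpass_gain(freq_hz, CORNER_HZ))  # step 2: multiply by the gain
y = [value.real for value in idft(Y)]                  # step 3: back to the time domain

# Amplitude of the output sine, recovered from its mean square value.
amplitude = math.sqrt(2.0 * sum(v * v for v in y) / N)
attenuation_db = 20.0 * math.log10(amplitude)
print(f"output amplitude {amplitude:.4f} ({attenuation_db:.2f} dB)")
```

The output amplitude comes out near 0.894, about 1 dB below the input, which is what the gain formula predicts for a tone at half the corner frequency. In practice a fast Fourier transform library would replace the naive `dft` and `idft` helpers shown here.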

    1. Can you guarantee that the tweeters of your speakers are in phase with the woofers and subwoofers? The step response could reveal this! 😉 I have never ever seen a perfect step response for a speaker with an analog crossover. Digital crossover technology might solve the problem. Thus I wonder what the 20 kHz listening tests really revealed or not!

      1. There are only two direct firing tweeters and they are in the main speaker systems. The woofers are side firing. They are only in phase to the degree that the manufacturer made them that way which probably means they aren't. None of the other 38 tweeters are in phase with the main system. 22 of them fire their undelayed sound at the walls and ceiling and therefore their sound arrives with many phase angles. The other 16 have substantial multiple time delays, no undelayed signals, and therefore have no phase relationship to the woofers.

        That being said, it is a rare speaker where there is phase coherence. To meet that criterion the tweeter must be geometrically coincident with the woofer AND the difference in what is called group delay compensated for geometrically or with digital delays. I've seen a handful that might meet this criterion. The KEF LS 50 might be one of them. Phase coherence does not necessarily by itself guarantee a good loudspeaker system, in fact it may be awful. If phase interference is audible, it is almost certainly because it causes FR reinforcements and cancellations at different points in space and at different frequencies. I'm not aware of any documentation that shows that this kind of phase incoherence relates to anything audible except at bass frequencies, where "suckout", a steep falloff in response due to cancellation with the reflected wave, is a problem. Roy Allison did a lot of research on this problem, wrote a paper, and created designs to address it.

        The reason you don't see anything like a perfect square wave from a loudspeaker is that the tweeter is much faster to respond than the woofer. The first response is a sharp spike, that's the tweeter responding. Then a little later you see a hump, that's the woofer responding. These are acoustic phase measurements on axis. That means this is the best they can do. Move off axis and it will get much worse as the tweeter response falls off sharply in amplitude while the woofer response doesn't to nearly the same degree. Generally HF dispersion of modern tweeters is awful and that is usually deliberate. Even a line array of wide lateral dispersion ribbon tweeters like Infinity IRS has poor vertical HF dispersion. That's considered an advantage by audiophiles. Real musical instruments have very wide dispersion at all frequencies in practically all directions and their reflections in a room are entirely different. That's one reason they sound so different from hi fi speakers.

      2. Very few drivers have the time response required for reproducing musical transients. On the high end of driver bandwidth, this requires exceptionally low inductance. For example, typical woofers have resistance to the acceleration of fast changing signals on the order of 1-4mH, while "fast" woofers with Neodymium magnets or Faraday rings have inductance of .1-.5mH.

        At the low end of the driver bandwidth one must contend with resonance which produces group delay and introduces spurious frequencies in response to transients, which could be first order (step function) or second order (rectangular modulation envelope).

        Digital signal processing can only go so far in correcting these inherent shortcomings in the speaker mechanisms. For example, if you try to follow a complex musical signal while compensating for the resonance of the mass-spring system, the output will go chaotic because of the dynamic shifts in driver parameters like Re. This also takes negative time or negative output impedance, which are possible but impractical.

  10. Once upon a time when I was young and an audiophile myself, I thought that once harmonic distortion, IM distortion, and noise were below the threshold of audibility, the only other factors that mattered were frequency response, frequency response, and frequency response. And then I went to school. And years later when I thought about it, I realized how wrong I was. The fallacy probably comes from an intuitive notion in EE-think. This notion has it that if the FR of everything is flat, then what comes out of the speaker will be exactly what goes into the microphone. The flaw comes from the fact that sound has dimensions electrical signals don't. It has three spatial dimensions. Electrical signals only have amplitude as a function of time. So the sound that comes out of the loudspeaker has little in common with what went into the microphone in these critical dimensions that were lost. And since electrical engineers were in large part responsible for the development of audio electronic recording and playback equipment, they used their EE-think to see the problem. Over nearly a century they've beaten that problem into the ground so many times and so many ways that it isn't going to get any deader.

    It's over and the EEs still haven't won. They need the help of a branch of mechanical engineering called fluid dynamics. Acoustics is a branch of fluid dynamics and this area does deal with directional aspects of energy generated in and passing through gases and liquids in space. EEs trying to solve their problem by further killing it off in assuming there is still some electrical component they've overlooked such as FR beyond 20 khz is not going to work. They could extend FR out to a million khz and it won't get any better. When you hit a dead end, you go backwards until you get to a point where dead ends aren't the only paths left open. That could mean going back to square one and studying the goal which is sound, not electrical analogs of sound. Now there may be ways to get beyond the dead end walls that the current train of thinking always leads to. When seen from this perspective, the problem looks entirely different.

  11. For those not familiar with Mr. Boyk, in addition to being an instructor at Cal Tech (now possibly retired), he is a concert pianist and has made a few recordings. He has also authored articles on audio component selection and matching. At least one of his courses at Cal Tech investigated differences between live and recorded sound.

    So he does not come to the table without credentials.

    1. The people with this argument are grasping at straws. Even if it were true (it really makes no sense to me) the effect is so subtle and there are so many other unanalyzed unsolved aspects to the problem of hi fi sound that are many orders of magnitude greater that this is a ridiculous waste of time, effort, and money. You are not going to slay a dragon by cutting off its toenails.

      I've heard this siren song of extended FR on and off since I was 12 years old. The Audio Fidelity Record's Frey Stereophonic Curtain of Sound with response to 25 khz - "you won't believe your ears", the Harman Kardon A1000 solid state integrated amplifier with FR +/- 1 db 1 hz to 1 megahz, the University Sphericon super tweeter with response to 40 khz. Where are they now? I have two phono cartridges, one Empire 999VE flat to 20 khz, the other a CD-4 Empire 4000D/III with a Shibata type stylus with response to 45 khz. They sound identical to me. (I don't own even one CD-4 record, I got the cartridge from Empire at a trade show for 1/3 the regular price.) In fact greatly extended FR not only doesn't usually solve any problems, it can cause problems that otherwise wouldn't exist such as vulnerability to RF noise that have audible consequences. Building the Dynaco Stereo 120 kit, one task is winding 11 turns of wire from the output around the output coupling capacitor to create an inductor reducing bandwidth to 100 khz. That's far enough.

  12. The physical integrity of the wave-front launched by conventional cone drivers is dependent on the presence of the full range of overtones. As the wave-front moves through space it falls apart. Lower frequencies travel further than higher ones. Tests using earphones are therefore useless. The more complete the panoply of overtones the longer the wave-front will maintain its 3-dimensional shape. There are no fundamental musical frequencies above about 3 kHz so the presence of higher harmonics that we cannot hear directly affects the timbre of those we can.

  13. In my teens I could copy Morse code up to 27 kHz. It's possible I may have been hearing "key clicks," the transients when the tone started and stopped. Nonetheless, up to my mid 30s, ultrasonic motion detectors bugged me. It wasn't so much a tone as a sensation of pressure. I could go in Macy's, rock my head, look at the ceiling and spot the detector every time. Why they left them on in the daytime is beyond me. But in a few minutes I'd get a headache and have to leave. Not a good marketing strategy!

    Even now, there seems a huge difference between 44.1 and 48k sampled audio, and a smaller but noticeable improvement at higher rates. Yet at almost 62 I can barely hear 14 kHz.

    It figures that, once TV flybacks stopped bugging me, everyone would get LCD screens... 🙂

    1. Over sampling is very different than frequency response. But I am not sure this is what you are asking. Over sampling is a way to raise the sample rate - the number of times within a given period a sample (snapshot) is taken of the original analog signal. In the case of over sampling as applies to this discussion, the original analog signal is long since gone so we over sample the digital signal.

      Let's imagine the original sample rate was 44.1kHz. This means 44,100 times a second a sample was taken of the incoming audio, then converted to a digital representation of that sample. If we want to over sample that original set of samples, say 4x, then the outcome would be 176.4kHz. This would be done in the digital domain through a series of complex calculations.

      There's no more information to be sampled or had, though some can be guessed at through another process called interpolation. The advantage of this sample rate conversion is the digital filters used to clean up the incoming data are now running at 4 times faster rates than the original, thus relaxing their original stiff requirements - and hence we might get better sound.

      But, the frequency response of the original analog musical signal is unchanged.

      1. In your initial post, you related higher sampling rates to "higher than we can hear," by which I assumed you meant frequencies, not sampling rates. That was what I was asking about.
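The 4x oversampling step described in the replies above can be illustrated in a few lines. This is a rough sketch under stated assumptions, not a description of how any particular DAC does it: it zero-stuffs the 44.1 kHz samples and then applies a Hann-windowed sinc as the interpolation filter, and the function name, the 1 kHz test tone, and the filter length are all hypothetical choices.

```python
import math

def oversample(samples, factor, taps_per_side=32):
    """Upsample by an integer factor: zero-stuff, then low-pass with a windowed sinc."""
    # Zero-stuffing: insert (factor - 1) zeros after every original sample.
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (factor - 1))

    # Interpolation kernel: sinc with its cutoff at the ORIGINAL Nyquist frequency,
    # tapered by a Hann window so that it has finite length.
    half = taps_per_side * factor
    kernel = []
    for i in range(-half, half + 1):
        x = i / factor
        sinc = 1.0 if i == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * i / half))
        kernel.append(sinc * window)

    # Direct convolution, with the kernel centered on each output sample.
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for i in range(-half, half + 1):
            idx = n - i
            if 0 <= idx < len(stuffed):
                acc += kernel[i + half] * stuffed[idx]
        out.append(acc)
    return out

ORIGINAL_RATE_HZ = 44_100
FACTOR = 4

# Assumed test signal: 100 samples of a 1 kHz sine at the original rate.
original = [math.sin(2 * math.pi * 1_000 * n / ORIGINAL_RATE_HZ) for n in range(100)]
upsampled = oversample(original, FACTOR)
new_rate_hz = ORIGINAL_RATE_HZ * FACTOR

print(new_rate_hz)                     # 176400
print(len(original), len(upsampled))   # 100 400
```

Every fourth output sample is one of the original samples, and the ones in between are computed from their neighbours, so no new information about the music appears. What changes is the rate the downstream filters run at, which is the point made in the reply.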


Stop by for a tour:
Mon-Fri, 8:30am-5pm MST

4865 Sterling Dr.
Boulder, CO 80301

Join the hi-fi family
