## Quibbles and Bits

# Fifty Years After

Back in 2000, Dr. Michael Unser of the Swiss Federal Institute of Technology in Lausanne published an interesting technical paper entitled “*Sampling – 50 Years After Shannon*”. In this paper he considers the state of the art in digital sampling. It is not a puff piece. It requires a post-graduate level grasp of mathematics if one is to follow it in any serious detail. It mostly goes over the top of my head, for starters. But it makes some interesting points along the way, including the dry observation that the so-called “Nyquist-Shannon” theorem handily predates both Nyquist and Shannon!

One key finding is as follows. Unser reduces the problem of regular digital sampling to a ‘general theorem’. In other words, all methods of digitally sampling a continuous function are special cases of this general theorem. It goes something like this:

All continuous functions (such as waveforms) can be represented as the sum of a number of “orthogonal functions”. Orthogonal functions are like the *X-Y-Z* axes of a co-ordinate system, where an object’s location in three-dimensional space can be unambiguously specified by its co-ordinates, given by its positions along the *X*-axis, *Y*-axis, and *Z*-axis. If the object moves purely along the direction of the *X*-axis, then its *Y*-axis and *Z*-axis co-ordinates will remain unchanged. In fact, I can change its position along any one of the three axes without affecting its position along the other two. It is this property that makes the three axes “orthogonal”. The same property makes for “orthogonal” functions – you can independently change any one of them without affecting any of the others.

An example of this would be the frequencies of an audio signal. I can change the amount of the 1kHz frequency content, and it will have no impact on any of the other frequencies present. The frequencies – or more specifically the sine waves exhibiting them – are therefore “orthogonal functions”.
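For the numerically inclined, this orthogonality is easy to check directly. The sketch below is purely illustrative (the frequencies, duration, and sample count are arbitrary choices of mine, not anything from Unser's paper):

```python
import numpy as np

# Two sine waves at different whole-number frequencies, sampled over one
# common period, are orthogonal: the average of their product is zero,
# so changing the amount of one has no effect on the other.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)   # one second of "time"
three_hz = np.sin(2 * np.pi * 3 * t)
five_hz = np.sin(2 * np.pi * 5 * t)

cross = np.mean(three_hz * five_hz)   # inner product between the two: ~0
power = np.mean(three_hz * three_hz)  # inner product with itself: 0.5
```

The zero cross-term is exactly what lets us adjust the 3Hz content without disturbing the 5Hz content.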

Usable families of orthogonal functions can range from simple to very complex, and the set of such families may even be infinitely large. The simplest members are basis functions such as sine waves. For sine waves, Unser’s set of coefficients is obtained by performing a Fourier Transform. Slightly more elaborate families include such things as “wavelets”, which are best described as short bursts of sine waves, and splines, which are best known as curve-fitting functions. Much interest over the past 20 years has been focused on wavelets, and it seems likely that this will accelerate in the future, as the computing power required to use them to their best advantage becomes more commonplace.
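To make the sine-wave case concrete, here is a minimal sketch (NumPy’s FFT standing in for the Fourier Transform; the two component frequencies and amplitudes are arbitrary choices): a waveform built from two sine members of the family yields exactly two non-negligible coefficients.

```python
import numpy as np

# Build a waveform from two sine-wave members of the family, then ask
# the Fourier Transform (here NumPy's real FFT) for the coefficients.
N = 1024
t = np.arange(N) / N
wave = 0.8 * np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 40 * t)

# Scale so each coefficient reads directly as a component amplitude.
coeffs = np.abs(np.fft.rfft(wave)) / (N / 2)

# coeffs[10] ~ 0.8 and coeffs[40] ~ 0.3; every other coefficient is ~0.
```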

Unser’s paper tells us how to examine any set of orthogonal functions to determine whether they are suitable for representing a waveform. Unfortunately, the test itself is mathematically abstruse, and does not lend itself to a pithy description in plain English. But if a set of orthogonal functions proves to be suitable, then our waveform can be fully represented by determining a corresponding number (in mathematical terms a “coefficient”) for each of the orthogonal functions. We can then store those numbers, and use them to fully and accurately reconstruct the waveform at some future time.

This is Unser’s general theorem of digital sampling, and he uses it to ask and explore some very interesting questions, ones which may well prove to be useful in the near future. But before discussing that, we’ll just take a quick look at how Nyquist-Shannon sampling theory fits into it. Suppose we choose as our family of orthogonal functions the *Sinc()* function:

*Sinc(x) = Sin(x)/x*

As it happens, when we work out what the corresponding coefficients are for the *Sinc()* functions, they turn out to be the real values of the waveform itself as it evolves with time. In other words, turning the whole thing backwards, if we sample our waveform in time, the resultant sample values will be the coefficients of an orthogonal family of *Sinc()* functions which can be used to exactly reconstruct the original waveform.
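That reconstruction can be sketched in a few lines of Python. The tone frequency, sample rate, and window length below are arbitrary illustrative choices, and because we truncate the (in principle infinite) run of samples to two seconds, the match is very good rather than mathematically exact:

```python
import numpy as np

def sinc_reconstruct(samples, fs, times):
    """Shannon reconstruction: x(t) = sum_n x[n] * sinc(fs*t - n).
    np.sinc is the normalized sinc, sin(pi*x)/(pi*x)."""
    n = np.arange(len(samples))
    return np.sum(samples * np.sinc(fs * times[:, None] - n), axis=1)

fs = 100.0                                 # samples per second
n = np.arange(200)                         # two seconds' worth of samples
samples = np.sin(2 * np.pi * 7 * n / fs)   # a 7 Hz tone, well below Nyquist

times = np.linspace(0.5, 1.5, 50)          # instants in between the samples
rebuilt = sinc_reconstruct(samples, fs, times)
exact = np.sin(2 * np.pi * 7 * times)
# 'rebuilt' tracks 'exact' closely, limited only by truncating the sum.
```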

I have stated glibly that we can choose any family of orthogonal functions which meet some incomprehensible criteria, and fully represent our waveform by storing only the coefficients of these functions. However, this is of no practical use if our family of orthogonal functions is infinitely large, because we’d then have to store an infinitely large set of coefficients. This is where the concept of “bandwidth limitation” comes in.

We are familiar with the Nyquist Criterion, which states that our waveform must contain no frequencies above one half of the sampling rate. In the context of Unser, this means that by reducing our infinitely large family of orthogonal functions to a finite set – such as by eliminating all those which correspond to frequencies above our Nyquist Criterion – we can represent our waveform using a finite set of coefficients. We can apply this kind of logic to any family of suitable orthogonal functions. By appropriately reducing the size of the family to a finite subset we will end up with a finite set of coefficients. The smaller this set can be made, the fewer numbers are needed to fully represent the waveform.
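The penalty for violating the Nyquist Criterion is worth seeing once in code (the specific frequencies here are arbitrary choices): a 70Hz tone sampled at 100Hz produces exactly the same samples as a phase-inverted 30Hz tone, so once sampled the two are indistinguishable.

```python
import numpy as np

fs = 100.0                              # 100 Hz sampling -> 50 Hz Nyquist limit
n = np.arange(64)

above = np.sin(2 * np.pi * 70 * n / fs)   # 70 Hz: violates the criterion
alias = -np.sin(2 * np.pi * 30 * n / fs)  # 30 Hz tone, phase-inverted

# Sample for sample the two are identical: the 70 Hz content has "folded
# down" to 30 Hz (70 = 100 - 30), and no reconstruction can tell them apart.
```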

For the most part, this analysis appears only to be of use for the purpose of data compression, where it has limited applicability. At the end of the day, information theory already tells us most of what we need to know to determine just how much compression can actually be achieved. But where Unser’s paper gets really interesting is where it heads next.

Unser invokes the physicist’s “frog on a lily pad”. This is where a frog attempts to cross a lake by jumping from lily pad to lily pad. Each lily pad is exactly half as far from the far side of the lake as the previous pad. The mathematician says that the frog will never reach the other side, but the physicist observes that at some point the gap will be so small as to be meaningless. Unser recognizes that there is a distinction between a mathematically exact representation, and one where any errors in the representation are practically irrelevant.

Before you get too excited, Unser does not take us anywhere immediately usable with this analysis. He merely illustrates some ways in which this observation can be taken into account within his general theorem. But the concept is an intriguing and useful one. [*It has been suggested – or rather hinted at – that some of these principles may be at play within Meridian’s controversial MQA technology, but at the time of writing MQA’s inner workings remain undisclosed.*] As an example, conventional Nyquist-Shannon theory requires strict bandwidth limitation, but practical anti-aliasing filters can never be perfect. Unfortunately, the “better” the filter, the worse its time domain (*i.e.* phase) response will be. Unser’s analysis may provide a mathematical framework within which practical issues such as this can be formalized.

Wait a while. Once I have my post-doctoral degree in theoretical whatever I’ll be able to understand this gobbledygook. Until then it makes my head hurt.

I hear you. 🙂

My column is intentionally aimed far deeper into the technical aspect than is normal in our business. My feeling is that we already have plenty of writers who, justifiably so, confine themselves to safely scratching the surface. I’m trying to strike a balance between fluff on the one hand, and incomprehensibility on the other, while attempting to cover my assigned field to a level of depth which may both inform and enlighten readers.

I expected this particular piece would be pushing hard on the incomprehensibility side of the balance, and I therefore welcome your comments (and enjoy the dry humor of your delivery!). I’m not a professional writer, and as such can only benefit from the considered feedback I receive.

Makes my head hurt a bit also, but I love it. Keep it up Richard.

The “frog on the lily pad” paradox was an “interpretation” by physicists of Zeno’s Paradox, Zeno being a pre-Socratic philosopher and mathematician. It’s also known as the “Achilles and the Tortoise” Paradox due to its narrative. As far as we know it was first resolved by Aristotle in the 4th century B.C. So I guess the philosophers’ paradox and solution “handily predates” the physicists’…by a couple of millennia.

One might also observe that in an actual audio system, e.g. an amplifier, no frequency is truly orthogonal to another. Or are we back to Zeno again?

Back to Zeno again 🙂

Even in the digital domain, most orthogonal fields will be at least partially coupled by the non-linear effects of truncation of numerical values to a finite representation! … Limited bit depth to the layman.
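As an aside, that coupling is easy to demonstrate (a Python sketch; the tone frequency and bit depth below are arbitrary choices): quantizing a pure tone to a coarse bit depth puts energy into harmonic bins that were exactly empty before.

```python
import numpy as np

N = 4096
t = np.arange(N) / N
tone = np.sin(2 * np.pi * 8 * t)       # a single pure sine component

def quantize(sig, bits):
    """Round to a uniform 'bits'-deep grid spanning [-1, 1]."""
    levels = 2.0 ** (bits - 1)
    return np.round(sig * levels) / levels

spectrum = np.abs(np.fft.rfft(quantize(tone, 4))) / (N / 2)

# Before quantization only bin 8 is occupied; afterwards, other bins
# (notably the odd harmonics at 24, 40, ...) carry energy they never had.
# The nonlinear truncation has coupled components that were orthogonal.
```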

Back in June we had what was called “the Shannon Event” at Bell Labs, celebrating the 100th birthday of the brilliant mathematician who worked and made his discoveries there. Shannon’s breakthrough theory will live forever, or for however long it takes for someone to disprove or better it. That certainly won’t be in my lifetime. This is the bedrock of digital information theory. Ironically it is accepted by every professional in the telecommunications industry except some audiophiles. How funny that truly is. They’ve got all the answers.

Just as there would have been no Einstein without a Newton first, there would have been no Shannon without a Fourier first. That is also bedrock mathematics. Again, only audiophiles don’t accept it. Now how many of them have computed a Fourier transform integral or inverse integral, or for that matter would even recognize one if they ever saw it? It’s fun to watch people hit their heads against a brick wall which, no matter how many times they hit it, won’t budge; they never seem to give up. Latest strategy: so-called high-definition or high-resolution audio. Sorry Dr. Waldrep. By your own admission no one can hear a difference. Maybe you should just pump it up to one megahertz.

Now that I’ve actually read this article, I’ve got a few technical comments. Mathematics is an abstraction which is not an exact analog of the physical world. It’s only the best tool we have. When it comes to two objects approaching each other the model of an asymptote can break down…or does it? Take the collision of two ordinary objects. The fact is that they never really touch each other. The outer electrons of the atoms on the surface of each repel the electrons on the other with a force inversely proportional to their distance squared. No amount of force we can apply will fuse them or make them touch each other, certainly not in ordinary circumstances. Instead they work by “action at a distance”, by their electrical fields which mutually repel each other. While we can calculate the exact magnitude of that force, no one has ever advanced a plausible explanation of how it works. In fact the solidity of the objects themselves is an illusion because, as we know, they are mostly empty space.

Back to the problem at hand. We have dependent and independent variables. The independent ones are said to be orthogonal to each other. When it comes to the electrical voltage in a wire, it is a function of time. You can break it down mathematically using Fourier’s transform theorem but that is all you have: one independent variable, time, and one dependent variable, amplitude – *f(t) = A·sin(ωt)*. But this is NOT an analog of sound. This is where your industry is blind and frankly dumb as a stump. Sound has three other orthogonal variables and those are in fact *X*, *Y*, and *Z*. That is right, sound waves travel in space in three dimensions. So there are five variables, not two. The trick I used to deconstruct sound was to turn it into a six-dimensional problem but I’m not going to tell you how. This allowed me to understand exactly how sound and acoustics work. That was over 42 years ago and I haven’t paid much attention to what this blind industry has done ever since. Unless and until it comes to grips with these missing dimensions, it is no better at understanding sound than flat-earther mapmakers are at making accurate maps, having never come to grips with the fact that we live on a globe. 50 years on, no meaningful progress, no true added knowledge.

That’s rather a harsh assessment, but a well-reasoned one.

I’m sure I’m not the only one who’d be interested in hearing about the six-dimensional problem….

Okay, I’ll tell you. Read my patent 4,332,979. It’s a simplified version of the full-blown theory. The problem starts out as a five-dimensional problem: amplitude versus time and direction of arrival through a closed surface. Normally electrical engineers think of a waveform as being either in the time domain or the frequency domain. By a simple mathematical trick the location of each echo in time is left in the time domain but the amplitude of each one is in the frequency domain. This allows you to get a complete account of every arriving echo at a given point from a given source. This solution is superimposed on each dS element of a closed surface, typically a sphere. The first arriving sound is normalized, which allows for seeing the change each echo from each direction underwent before arriving. Where there is more than one source, the resulting field is the superposition of all of the transfer functions, so long as the number of sources is finite. The model therefore allows the total field to be reconstructed rather than reproduced, since it cannot be recorded. In a laboratory this can be done with any degree of precision desired from anechoic recordings. But in a home environment, from commercially made recordings, a satisfactory reconstruction can only be arrived at through experimenting with each recording individually. There is no one right answer but some are more pleasing than others. I’ve built only two prototypes of the simplified version; the second one still exists but I lost interest in it about a year ago.

The only one who posts here who has heard it, besides me, is Paul. I could see from the look on his face, from the first second, that it was not anything like what he expected to hear.

OK, I follow in general terms what you are describing. This is very much a description and treatment of sound field dynamics, which is a different problem entirely from what I have been discussing, but one of huge potential impact. Your patent predates the availability of the required technology at an affordable price by a good 30-40 years, and has sadly expired in the meantime. A common story with patents 🙁

I am aware of many researchers who are studying in this field for a different, but complementary purpose to your own. They are considering the reverse problem of how to capture an individual sound in a studio setting, and then process it so that when listened to on a pair of headphones (in order to reduce the practical complexity) the captured sound is perceived to be located at a certain desired position (width, depth, and even height) in the soundstage. In a studio setting, that position might be subject to adjustment under the control of the recording engineer. You might think of it as “Pan-Potting: The Next Generation”.

Current research favors implementing the transformation by passing the source signal through a filter with a highly specific impulse response often described as a HRTF (Head-Related Transfer Function). Some researchers openly publish their HRTFs for others to play with – and we have played with them ourselves. Our observations were barely interesting, and nowhere near earth-shattering, but I have been told privately that some researchers have actually made startling progress.
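Mechanically, the processing is just a convolution of the source signal with each ear’s impulse response. The sketch below uses toy stand-ins of my own devising rather than a real measured HRTF pair (real HRIRs come from published measurement datasets and also encode spectral shaping from the head and outer ear, which this deliberately omits): a source off to the left arrives at the left ear earlier and louder.

```python
import numpy as np

fs = 48000                               # sample rate, Hz
rng = np.random.default_rng(0)
mono = rng.standard_normal(fs // 10)     # 100 ms of test signal

# Toy stand-ins for a measured HRTF pair: pure delay-and-attenuation.
itd = int(0.0006 * fs)                   # ~0.6 ms interaural time difference
hrir_left = np.zeros(128)
hrir_left[0] = 1.0                       # left ear: direct, full level
hrir_right = np.zeros(128)
hrir_right[itd] = 0.5                    # right ear: delayed and quieter

left = np.convolve(mono, hrir_left)      # left headphone channel
right = np.convolve(mono, hrir_right)    # right headphone channel
```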

Such work is still largely in its infancy, but there is no doubt that there is almost boundless potential here as Moore’s law drops ever greater processing power into our laps.

It may be a picky point, but orthogonality is not the same thing as linear independence. Whereas all orthogonal functions are indeed linearly independent, the converse is not true. The requirement for orthogonality is therefore stricter (and so, in a sense, more useful) than that for linear independence. And the Fourier Transform, far from being ‘all we have’, is but one of a possibly infinite set of different – and, FWIW, orthogonal – analogous transforms.
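The distinction shows up even with ordinary vectors (a trivial sketch; the particular vectors are arbitrary examples of mine):

```python
import numpy as np

# Linearly independent but NOT orthogonal: neither vector is a multiple
# of the other, yet their inner product is non-zero.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
ab = float(a @ b)    # 1.0 -> not orthogonal

# Orthogonal (and therefore automatically independent): inner product zero.
c = np.array([1.0, 1.0])
d = np.array([1.0, -1.0])
cd = float(c @ d)    # 0.0 -> orthogonal
```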

It is important to note that digital sampling theory *per se* is absolutely NOT an analog of a Sound Field, or even of a Sound itself. It refers merely to a means of representing a time-dependent variable which may (or may not) be representative of a sound. As such it has nothing to say about how a Sound Field evolves with time in a three-dimensional space. We, as an industry, are neither blind to this fact, nor ‘dumb as stumps’. Well, on reflection, maybe some of us are in fact dumb as stumps.

As for ‘Missing Dimensions’, without any further clarification on your part it adds as much value to the discussion as a Mpingo Disk.