*we can relatively easily reconstruct the original waveform using the stored numbers*”. However, like a lot of digital audio, once you start to look closely at it you find that what is easy from a mathematical perspective is often mightily tedious from a practical one. For example, Claude Shannon (he of the Shannon-Nyquist sampling theorem) proved that the mathematics of a perfect recreation of the analog signal involves the convolution of the sampled data with a continuous *Sinc()* function – I have described this in some detail in Copper 23. However, if you were to set about performing such a convolution, and evaluating the result at the interpolation points, you would find that it involves a truly massive amount of computation, and is not something you would want to do on any sort of routine basis. Nonetheless, convolution with a *Sinc()* function does indeed give you a mathematically precise answer, and interpolations performed in this manner would in principle be as accurate as it is possible to make them.

So if a convolution is not practical, how else can we recreate the original analog signal? What we do is make a sensible guess for what the interpolated value ought to be, and pass the result through a digital brick-wall filter to filter out any errors we may have introduced via our guesswork. If we have made a good guess, then the filter will indeed filter out all of the errors. But if our guess is not so good, then the errors can contain components which fold down into our signal band and can degrade the signal. This filtering method typically has the disadvantage (if you want to think of it that way) of introducing phase errors into the signal, and has the effect that if you look closely at the resulting data stream you will see that most of the original 44.1kHz samples will also have been modified by the filter. Up-conversion in this manner is usually performed by a specialized filter which in effect combines the job of making the good guess and doing the filtering.

When up-converting by factors which are not nice round numbers (*for example, when converting from 44.1kHz to 48kHz, the conversion factor is 1.088x*) the same process applies. However, it is further complicated by the fact that now you cannot rely on a significant fraction of the original samples being reusable as samples in the output. For example, if converting from 44.1kHz to 88.2kHz, every second sample in the output stream is derived from an interpolated value. The interpolated values, which contain the errors, alternate with original 44.1kHz sample values which, by definition, contain no errors. It can be seen, therefore, that the resultant error signal will be dominated by higher frequencies that were not present in the original music signal and can therefore be easily eliminated with a filter. On the other hand, if I am converting from 44.1kHz to 48kHz, then only one in every 160 samples of the 48kHz output stream will correspond directly to original samples from the 44.1kHz data stream (the ratio 48:44.1 reduces to 160:147, so the two sets of sample instants coincide only once per 160 output samples). In other words, 159 out of every 160 samples in the output stream will start off life as an interpolated value. Therefore the quality of this conversion is going to be entirely dependent on the accuracy of those initial interpolation guesses, or more specifically, the accuracy of the algorithm used to make those guesses (*a more complicated topic that I won’t be going into here*). Again, the process of making a best guess and doing the filtering is typically combined into a specialized filter, but the principle of operation remains the same.

Down-conversion is very similar, but with an additional wrinkle. Let’s start with a very simple down-conversion from 88.2kHz to 44.1kHz. It ought to be quite straightforward – just throw away every second sample, no?
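One way to find out is to try it on a tone that lies above 22.05kHz. The following is a minimal sketch in plain Python, not anything taken from a real converter: the 30kHz test frequency and the helper name `sampled_sine` are assumptions chosen purely for illustration. It samples a 30kHz sine at 88.2kHz, keeps every second sample, and compares the result with an inverted 14.1kHz sine sampled directly at 44.1kHz.

```python
import math

def sampled_sine(freq_hz, rate_hz, count):
    """Return `count` samples of a unit-amplitude sine (hypothetical helper)."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

# A 30 kHz tone is legal at 88.2 kHz (it is below 44.1 kHz),
# but it cannot be represented at a 44.1 kHz sample rate.
original = sampled_sine(30_000, 88_200, 64)

# Naive down-conversion: keep samples 0, 2, 4, ... and discard the rest.
naive = original[::2]

# An inverted 14.1 kHz tone (44.1 - 30 = 14.1) sampled directly at 44.1 kHz.
alias = [-s for s in sampled_sine(14_100, 44_100, 32)]

# Compare the two sequences sample by sample.
worst = max(abs(a - b) for a, b in zip(naive, alias))
print(f"largest difference: {worst:.2e}")
```

The two sequences agree to within floating-point rounding. In other words, the thinned-out data is indistinguishable from a 14.1kHz tone that was never in the recording, and that tone sits well inside the audio band.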

**No!** Here is the problem: With a 44.1kHz sample rate you cannot encode any frequencies above 22.05kHz (i.e. one-half of the 44.1kHz sample rate). On the other hand, if you have a music file sampled at 88.2kHz you must assume that it has encoded frequencies all the way up to 44.1kHz. So before you can start throwing samples away you have to first put it through a brick-wall filter to remove everything above 22.05kHz. Once you’ve done that then, yes, it is just a question of throwing away every second sample (a process usually referred to as decimation).

This additional wrinkle makes the process of down-sampling by non-integer factors rather more complicated. In fact, there are two specific complications. First, you can’t decimate by a non-integer factor! Secondly, because you’re now interpolating a signal which may contain frequencies that would be eliminated by the brick-wall filter, you need to do the interpolation first, before you do the brick-wall filtering, and then the decimation last of all (*I’m sorry if that’s not immediately obvious – you’ll just have to stop and think it through*). In summary, to get around these two issues, the process of down-sampling by a non-integer factor will usually involve (i) interpolative __up__-sampling to an integer multiple of the target sample rate; (ii) applying the brick-wall filter (matched to the final desired sample rate); and finally (iii) performing decimation.

I hope you have followed enough of what I just wrote to at least enable you to understand why I always recommend sample rate conversions between members of the same “family” of sample rates. One family includes 44.1kHz, 88.2kHz, 176.4kHz, 352.8kHz, DSD64, DSD128, *etc*. The other includes 48kHz, 96kHz, 192kHz and 384kHz. If you feel the need to up- or down-sample, try to stay within the same family. In other words, convert from 44.1kHz to 88.2kHz rather than 96kHz. And convert from DSD64 to 176.4kHz rather than 192kHz. But in any case, SRC does involve a substantial manipulation of the signal, and the principle that generally guides me is that if you can avoid it you ought to be better off without it.
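
The arithmetic behind the family recommendation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `conversion_plan` is hypothetical, and it models a conversion as up-sampling by one integer factor followed by decimation by another, along the lines of the three steps described earlier. Real converters may well organize the work differently.

```python
import math

def conversion_plan(rate_in, rate_out):
    """Return (up, down, intermediate_rate) for an integer-ratio conversion.

    Hypothetical helper: up-sample by `up`, filter, then keep every
    `down`-th sample. Rates are whole numbers of Hz.
    """
    common = math.gcd(rate_in, rate_out)
    up = rate_out // common      # integer up-sampling factor
    down = rate_in // common     # integer decimation factor
    return up, down, rate_in * up

# One same-family conversion followed by three cross-family ones.
for rate_in, rate_out in [(44_100, 88_200), (44_100, 48_000),
                          (44_100, 96_000), (96_000, 44_100)]:
    up, down, mid = conversion_plan(rate_in, rate_out)
    print(f"{rate_in} -> {rate_out}: up x{up}, down /{down}, "
          f"intermediate {mid} Hz")
```

Within a family the factors are tiny: 44.1kHz to 88.2kHz is simply up by 2 with no decimation at all. Crossing families, 44.1kHz to 48kHz needs up by 160 and down by 147, passing through an intermediate rate of over 7MHz, which is where the one-in-160 figure comes from.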