[The article deals with two’s complement, but the title was spelled as it is as a bit of a joke by Richard. So—no emails to Ye Olde Editor complaining of a typo are needed, thanks. —Ed.]
Most of us understand that PCM audio “samples” (measures) the music signal many times a second (44,100 times a second for a CD) and stores each result as a number. For CD audio this is a 16-bit number. A 16-bit number can take on integer values anywhere between 0 and 65,535. “Integer values” simply means whole-number values such as 19, 47,995 and 13,244. It cannot take on values such as 1.316 or 377½. However, even if you’re fine with that, recall that the music waveforms we are trying to measure swing from positive to negative, rather than from zero to a positive number. So we need to be able to record -13,244 as well as +13,244. Fear not, though – it turns out we can work around that. An interesting property of binary numbers can be brought to bear.
A 16-bit number is just a string of 16 digits which can be either one or zero. Here is an example, the number 13,244 expressed in ordinary 16-bit binary form: 0011001110111100. [If I need to explain binary numbers to you, you’re reading the wrong article.] If this were all zeros it would represent 0, and if it were all ones, it would represent 65,535. But there are actually different ways in which to interpret a sequence of 16 binary digits, and one of these is called “Two’s Complement” (spelled with an “e”, not an “i”).
Let’s first talk about 15-bit numbers. A 15-bit number can take on values between 0 and 32,767. Wouldn’t it be nice if we could encode our music as one 15-bit number representing 0 to +32,767 for all those times when the musical waveform swings positive, and another 15-bit number representing 0 to -32,767 for all those times when the musical waveform swings negative? In fact, we can do that very easily. We would take a 16-bit number, and reserve one of the bits (say, the most significant bit) to read 0 to represent a positive number and 1 to represent a negative number, and use the remaining 15 bits to say how positive (or negative) it is. Are you with me so far?
We need to make one small modification. As described, both the positive and the negative swings encode the value zero. We can’t allow two different numbers to represent the same value – it would make doing any math impossible – so we need to fix that. We allow the negative waveform swings to encode the numbers -1 to -32,768, and the positive waveform swings to encode the numbers 0 to +32,767. So now the value zero is unambiguously encoded as part of the positive waveform swing. This gives us a system that can encode all of the integer values from -32,768 to +32,767 which makes us very happy.
What I have described, in a roundabout way, is conceptually similar to what we actually end up doing. The actual solution is referred to mathematically as the “Two’s Complement” of our 16-bit number, and shares a lot with my simplified description. Two’s complement lets us express 16-bit data in a form that covers both positive and negative values, which makes us very happy. It differs from my simpler description in one respect: for all the negative numbers, the 15 bits that follow the sign bit have their zeros turned into ones, and their ones into zeros. The sign bit itself is left alone. This is done for mathematical reasons that I won’t go into here.
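For readers who like to see the bits, here is a minimal Python sketch of that relationship. The function names `sign_and_offset` and `twos_complement` are my own labels for illustration, not standard terms, and the sketch assumes the simplified scheme stores a negative value as a sign bit of 1 followed by its magnitude minus one (so that -1 uses the lowest code and -32,768 the highest).

```python
def sign_and_offset(value: int) -> int:
    """The simplified scheme: a sign bit followed by 15 bits.

    Positive swings hold 0..32767 directly. Negative swings hold
    -1..-32768, stored as (magnitude - 1) so zero is not duplicated.
    """
    if not -32768 <= value <= 32767:
        raise ValueError("value does not fit in 16 bits")
    if value >= 0:
        return value
    return 0x8000 | (-value - 1)


def twos_complement(value: int) -> int:
    """Flip the 15 bits after the sign bit, for negative numbers only."""
    pattern = sign_and_offset(value)
    if pattern & 0x8000:       # sign bit set, so the number is negative
        pattern ^= 0x7FFF      # invert the lower 15 bits, keep the sign bit
    return pattern


print(format(twos_complement(13244), "016b"))   # 0011001110111100
print(format(twos_complement(-13244), "016b"))  # 1100110001000100
```

The result agrees with Python's own `value & 0xFFFF`, which is the usual shortcut for obtaining a 16-bit two's complement bit pattern.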
It turns out that two’s complement also makes computers very happy, because numbers represented as two’s complement respond identically and correctly to the integer arithmetic operations of addition, subtraction, and multiplication. So we can manipulate them in exactly the same way as we do regular integers. In fact, two’s complement representation is so inherently useful to computers that software engineers have devised a much more user-friendly term for them – Signed Integers.
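To illustrate the point, the sketch below treats 16-bit patterns as plain unsigned numbers, does ordinary arithmetic on them, keeps only the lowest 16 bits, and then reads the result back as two's complement. The helper names (`encode`, `decode`, and so on) are hypothetical and exist only for this example; real hardware does the equivalent without any special handling of the sign.

```python
MASK = 0xFFFF  # keep only the lowest 16 bits


def encode(value: int) -> int:
    """Signed integer -> 16-bit two's complement bit pattern."""
    return value & MASK


def decode(pattern: int) -> int:
    """16-bit two's complement bit pattern -> signed integer."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern


def add_patterns(a: int, b: int) -> int:
    # Plain unsigned addition, discarding any carry out of bit 15.
    return (a + b) & MASK


def sub_patterns(a: int, b: int) -> int:
    return (a - b) & MASK


def mul_patterns(a: int, b: int) -> int:
    # Plain unsigned multiplication, keeping the low 16 bits.
    return (a * b) & MASK


print(decode(add_patterns(encode(-13244), encode(244))))  # -13000
print(decode(sub_patterns(encode(100), encode(300))))     # -200
print(decode(mul_patterns(encode(-3), encode(100))))      # -300
```

The answers are only correct while the true result fits within -32,768 to +32,767. Beyond that the pattern wraps around, just as it would in a real 16-bit register.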
Two’s Complement (or Signed Integer) representation is such a huge convenience that virtually all audio processing uses this representation. For example, simple signal processing functions like Digital Volume Control are more efficient to code with Signed Integers.
There is one thing to bear in mind, though, and it catches some people out. Recall that the negative swing encodes a higher maximum number than the positive swing. Here I am going to shift the discussion from the illustrative example of 16-bit numbers to the more general case of N-bit numbers. The largest negative swing that can be encoded has a magnitude of 2^(N-1), whereas the largest positive swing that can be encoded is 2^(N-1) – 1. This matters because the ratio between the two is not constant; it varies with N, the bit depth. This comes into play if you are designing a D-to-A Converter with separate DACs for the negative and positive voltage swings. You need to design it such that the negative and positive sides would both reach the same peak output with an input value of 2^(N-1), while recognizing that the positive side can never see it in practice, since it can only ever receive a maximum input value of 2^(N-1) – 1. There is at least one high-end DAC that I know of which intentionally gets this wrong, and its maker claims, wrong-headedly, to be uniquely solving a known problem.
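A few lines of Python make the asymmetry concrete. This is only an illustrative sketch; `signed_range` is a name I have made up for the purpose, and the bit depths shown are simply common examples.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Return (most negative, most positive) value of an N-bit signed integer."""
    most_negative = -(2 ** (bits - 1))
    most_positive = 2 ** (bits - 1) - 1
    return most_negative, most_positive


for bits in (8, 16, 24):
    lo, hi = signed_range(bits)
    # The ratio of the two extremes creeps toward 1 as the bit depth grows,
    # but it is never exactly 1.
    print(f"{bits:2d}-bit: {lo:>10,} to {hi:>+10,}   ratio {abs(lo) / hi:.8f}")
```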
Similar considerations exist when normalizing the output of a DSP stage (which should properly be in floating point format) for rendering to integer format. The processed floating point data is typically normalized to ±1.0000, and it would be an error to map this to ±2^(N-1) in Two’s Complement integer space, because +2^(N-1) cannot be represented, and this would result in clipping (or worse) of the positive voltage swing at its peak. Instead it must be mapped to ±(2^(N-1) – 1).
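Here is a minimal sketch of that mapping in Python. The function names, the clamping of out-of-range input, and the use of simple rounding are my own assumptions for illustration; a real renderer would normally add dither and may handle overs differently. The second helper shows one possible reading of “or worse”: if an out-of-range value is stored unchecked, it wraps around to the opposite extreme.

```python
def float_to_pcm(sample: float, bits: int = 16) -> int:
    """Map a float normalized to +/-1.0 onto +/-(2^(N-1) - 1)."""
    full_scale = 2 ** (bits - 1) - 1           # 32767 for 16-bit
    clamped = max(-1.0, min(1.0, sample))      # guard against overs
    return round(clamped * full_scale)


def store_as_signed(value: int, bits: int = 16) -> int:
    """What an N-bit signed register would hold if handed `value` unchecked."""
    pattern = value & ((1 << bits) - 1)        # keep the lowest N bits
    if pattern >> (bits - 1):                  # sign bit set
        return pattern - (1 << bits)
    return pattern


print(float_to_pcm(1.0), float_to_pcm(-1.0))   # 32767 -32767

# The erroneous mapping: scale by 2^(N-1) instead.
naive_peak = round(1.0 * 2 ** 15)              # 32768, which does not fit
print(store_as_signed(naive_peak))             # -32768
```

In the erroneous case a full-scale positive peak comes out as a full-scale negative one, which is considerably more audible than a clipped sample.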
Such things make a difference when you operate at the cutting edge of sound quality.