eXtreme Digital

September 6, 2022
 by Paul McGowan

One of my readers, Mark, asked me to explain what DXD is and why we’ve even heard of it.

Most of us know about DXD because of Octave Records and our penchant for DSD.

When we first started Octave Records we used the Sony Sonoma DSD system (Sonoma was originally developed by Sony and is now owned by Gus Skinas, who was on their design team). Sonoma is only capable of DSD64, sometimes referred to as single-rate DSD. This is a wonderful-sounding format but it can be bettered by a higher-rate DSD system, of which there is basically only one.

Pyramix by Merging Technologies.

Wikipedia explains: “Digital eXtreme Definition (DXD) is a digital audio format that was originally developed by Philips and Merging Technologies for editing high-resolution recordings recorded in Direct Stream Digital (DSD), the audio standard used on Super Audio CD (SACD). As the 1-bit DSD format used on SACD is not suitable for editing, alternative formats such as DXD or DSD-Wide must be used during the mastering stage. (DSD-Wide is a multi-bit, DSD-style format.)

DXD was initially developed for the Merging Pyramix workstation and introduced in 2004. This combination meant that it was possible to record and edit directly in DXD, and that the signal only converts to DSD once, before publishing to SACD. This offers a great advantage to the user, as the noise created by converting to DSD rises dramatically above 20 kHz, and more noise is added each time a signal is converted back to DSD during editing.”
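The “noise rises above 20 kHz” point refers to the noise shaping inherent in 1-bit sigma-delta conversion. As a rough illustration only (not any product’s actual converter), a first-order modulator in Python shows how a 1-bit stream preserves a signal’s average while the quantization error ends up in rapid high-frequency toggling:

```python
def sigma_delta_1bit(samples):
    """Encode samples in [-1, 1] as a +/-1 bitstream with a
    first-order sigma-delta modulator (illustrative sketch only)."""
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        integrator += x - feedback          # accumulate the input-vs-output error
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)               # the 1-bit output
    return bits

# A DC input of 0.5 comes back as a bit pattern whose average is ~0.5;
# the quantization error lives in the fast +/-1 toggling (high frequencies).
bits = sigma_delta_1bit([0.5] * 1000)
print(sum(bits) / len(bits))
```

Low-pass filtering such a stream recovers the audio; converting it back and forth during editing re-applies this process, which is why the shaped noise accumulates.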

Basically, DXD is PCM running at 8X the CD rate of 44.1kHz, or 352.8kHz, at 24 bits.
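The arithmetic behind these family rates is simple, since everything here is a multiple of the 44.1kHz CD rate:

```python
CD_RATE = 44_100                    # Hz, Red Book CD sample rate

dxd_rate = 8 * CD_RATE              # DXD: 8x the CD rate
dsd64_rate = 64 * CD_RATE           # single-rate DSD (Sonoma's limit)

print(dxd_rate)    # 352800  -> the "352.8kHz" figure
print(dsd64_rate)  # 2822400 -> DSD64's 2.8224 MHz bit rate
```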

With that background, here’s an interesting twist that was told to me (and is unverified). When Philips and Merging released Pyramix, they touted that their program was DSD-based and capable of full editing and mixing. They further suggested that it was all in DSD. As the story goes, Sony freaked out and cried foul, thus forcing Philips and Merging to rebrand their editing process as DXD so that people wouldn’t think it was DSD.

Merging is said to have called “BS,” because DXD was sonically invisible and no difference could be heard. The arguments went on, and today PCM at 352.8kHz (which is clearly not DSD) has its own name.

From my perspective, it’s clear to me that if you record in DSD and edit and mix in DXD, it is then sonically invisible. The opposite is definitely not true. Recording in DXD does not sound as good.

In any case, that should hopefully answer Mark’s question.


11 comments on “eXtreme Digital”

  1. Interesting that Merging, when hiding the true processes and formats, acted like Mobile Fidelity in their recent dilemma…with similar results of finally having to clearly determine the facts.

    If everyone believed in the acoustic transparency of this DXD process, it probably wouldn’t have been necessary to lie. On the other hand, it could indeed be quite transparent acoustically, and the facts were simply hidden; such transparency can’t be proven and certainly can’t be established on a theoretical basis (at least in theory, no format conversion can be fully transparent; it can only be inaudible on a setup of a certain quality or to certain individuals).

  2. The whole format thing seems to drive a lot of enthusiastic “philes” to the edges.

    The lines seem to get blurred, especially between the recording and playback processes and digital formats. The concepts don’t seem to be mutually transferable. For example, the DSDAC takes whatever format it is fed and plays it back at 4x DSD rate. Yet PCM at 352.8 is indistinguishable from 4x DSD in the mixing world. Is it indistinguishable from 4x DSD in the playback world?

    That leads to the question: is one DAC topology (R2R ladder versus delta-sigma) better suited to one format than another? Or is the final sound of any given DAC determined more by its analog output and interconnect circuitry?

    1. There seems to me to be an unfortunate overlap between using DSD/DXD for signal processing, what we usually call upsampling, and using it for recording. As far as I understand, they are completely different processes and should be considered quite separately.

      My system, so far as I am aware, can play a DSD64 file by sending it to my player using DoP (apparently repackaging DSD so it looks like 24-bit PCM frames but is still DSD), and then the player upsamples the DSD to 40/384 PCM, to which DSP is applied before D/A conversion. I learned all this well after having bought it. Not wishing to go over the edge, as you say, I don’t give it much thought; none at all, really. It keeps me happy.
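For what it’s worth, the DoP framing mentioned above works roughly like this: each 24-bit “PCM” word carries a marker byte (alternating 0x05/0xFA, so the DAC can recognize the stream as DSD) plus 16 bits of raw DSD data. A minimal sketch of that packing for one channel (the function name is mine, not from any DoP library):

```python
DOP_MARKERS = (0x05, 0xFA)  # alternating marker bytes defined by the DoP convention

def dop_pack(dsd_bytes):
    """Pack a mono DSD byte stream into 24-bit DoP words:
    [marker:8][dsd byte n:8][dsd byte n+1:8]."""
    words = []
    for i in range(0, len(dsd_bytes) - 1, 2):
        marker = DOP_MARKERS[(i // 2) % 2]
        words.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
    return words

print([hex(w) for w in dop_pack(bytes([0xAA, 0x55, 0x12, 0x34]))])
# ['0x5aa55', '0xfa1234']
```

The DSD bits are never altered, which is why the player can unpack and process them as genuine DSD on the other end.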

  3. I’m not sure Paul is correct. When Bob Ludwig mixed the first SACD in 2000 he used converters operating at 24/352 and 24/384. They were made by dCS. They were already using 24/384 in their DDC (upscaler). A few years later it was called DXD, as Paul explains. https://dcsaudio.com/timeline

    When I was looking at Mike’s comment yesterday about recording Jurassic World Dominion under Covid-19 restrictions, I was reminded of Alina Ibragimova’s recording of the Paganini Caprices on Hyperion. It was recorded in extreme lockdown: the solo musician in the hall, the producer in one room, and the engineer in another. It’s amongst the best releases of the last year; Gramophone called it a real “being there” quality recording. It was recorded in 24/96 PCM. My other recording of similar quality was recorded in DXD (24/384) on the Myrios label. Unless you actually record the same performance in DSD and DXD (PCM) at the same time, if that is even possible, I don’t know how you can say one is better than the other.

    1. SntbcwS,

      Your last sentence is what makes the most sense to me. Comparisons should be apples to apples, so to speak. When comparing formats in the recording world, the exact same performance should be captured by exactly the same equipment and played back, with zero manipulation, through exactly the same playback setup; ideally one neither upsampled nor downsampled.

      To have a real comparison in playback (assuming you can get identical recordings) you can’t have any format manipulation, whether in your Devialet or a DSDAC, if you’re to judge fairly and realistically.

      In the end, all these endless discussions arrive at the same goal: being satisfied with your playback setup and the recordings and formats you choose to play back.

  4. My experience has been that converter design can make a far bigger difference than format. There could easily be no DXD A to D converter that is as good as the best DSD converter. That could also change in the future.

    1. I can imagine that’s true…and if my logic isn’t nuts, this would prove that those conversion processes aren’t transparent, lossless or sonically invisible.

    2. While common-sense logic would agree with you, the facts do not. Here’s the thing: all modern A/D converters use delta-sigma modulators that first generate an essentially DSD signal, which must then be converted internally to PCM.

      Using DSD as the recording medium skips the low-pass filter required to generate PCM. While it’s not a big deal, it can be heard. One of the most difficult tasks is getting that low-pass filter to sound transparent. Merging managed it in Pyramix, but only recently; earlier versions of the program sounded bad.
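As a toy illustration of that filter-and-decimate step (real converters use long, carefully designed multi-stage FIR filters, not this crude boxcar average), converting a 1-bit stream to multi-bit PCM amounts to low-pass filtering and throwing away samples:

```python
def decimate_to_pcm(bits, ratio=64):
    """Crude 1-bit -> multi-bit conversion: average non-overlapping
    blocks of `ratio` bits. Illustrative only; a real DSD->PCM
    decimator uses proper low-pass FIR stages, and the quality of
    that filter is what's hard to get transparent."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]

# 128 bits of alternating +1/-1 (pure high-frequency content, i.e. the
# shaped noise) average out to zero in the PCM domain:
print(decimate_to_pcm([1, -1] * 64, ratio=64))
# [0.0, 0.0]
```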

  5. Mikael Vest of Digital Audio Denmark ran a demonstration of HD formats at the AES Convention back in the oughts. They split a microphone feed into separate converters in their modular system to record in 16/44, 24/48, 24/96, 24/192, 24/384, and DSD from the same input signal. On the show floor, this was played back through headphones using a DAD headphone amp. (Note that headphones are minimum phase.)

    Here is the critical part: the capture was a near coincident pair of microphones in an old school Classical soundstage built and tuned to record orchestras using the variable acoustics of the room for reverb instead of artificially added mush, and the content was solo piano – which is my native language.

    I was raised in a rural environment with a Mason and Hamlin Golden Age piano, and neither radio nor phonograph – but everyone played. My ears are PERMANENTLY “broken in” to the sound of a grand piano. At the time of this demonstration, I had not only a good sounding grand piano in my music room, it was supplemented with a lute strung harpsichord and they were played daily; and I was going to Classical concerts twice a week in the best halls of New York.

    The improvement from 16/44 Redbook to DSD was like candlelight to full sun in clarity. I could clearly hear the jumps to 24/48, 96, and 192; 384 was better yet, and DSD beat DXD, but barely. Considering that the file size of DSD/DSF is roughly equivalent to 24/96, the question was settled for me.
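That file-size comparison checks out arithmetically; per channel, single-rate DSD and 24/96 PCM carry bit rates within about 20% of each other:

```python
dsd64_bps = 64 * 44_100            # 1-bit DSD64: 2,822,400 bits/s per channel
pcm_96_bps = 96_000 * 24           # 24/96 PCM:   2,304,000 bits/s per channel

print(dsd64_bps, pcm_96_bps)       # 2822400 2304000 -> roughly 1.2x apart
```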

    BUT, for mixed and mastered multi-tracks, the fine differences beyond 96 are “transparent” because the temporal and spatial distortions of EVERY KNOB AND FUNCTION in the mixing and mastering engineers’ toolbox mask the delicate ‘air’ of the higher temporal resolution. As soon as you twiddle a knob, the fine details of a physical space are obscured. Most Classical engineers use a lot of mics and knobs, copying pop/rock recording techniques to create a fantasy world with fun-house mirror acoustics. Even audiophiles have ‘broken in’ to this artificiality, with softened transients and musical consonants like a vaseline coating on a camera lens to hide the skin flaws. To them I say “Human Is Better Than Perfect”(TM).

    For that matter, Classical concert goers who sit out past the 10th row in 1,000 seats or more are hearing softened transients and mostly reverb. After my first year in New York I figured out that the articulation and clarity I craved was close to the stage, and that meant you had to get season subscriptions – all the good seats were gone before single concert ticket sales began.

    I record in DSF in part BECAUSE it can’t be mixed, mastered, or altered in any way (except for leveling to 0dBFS, chopping, fade-in and fade-out). You can’t compress, equalize, balance, splice, overdub, nor add artificial reverb without first decimating. This means you need: (1) a room with the right reverb for the piece; (2) musicians who can play a piece from beginning to end with feeling while balancing themselves; and (3) you need to do all the ‘engineering’ BEFORE you hit the RECORD button.

    If you have the skill, you can utilize acoustic balancing, acoustic equalization, acoustic compression, and acoustic reverb. Also, acoustic monitoring – no headphones. Even better, you can do all of this with a live audience, which adds a flavor of excitement you can’t get in a studio – and the un-equaled resolution of a ‘zero knob’ DSF recording.

    This also radically lowers the cost of recording, because even with a rehearsal it only takes four hours to capture and a few hours to edit; and you can also sell tickets to pay for the engineering.

    1. Regarding Acuvox’s comment, after reading his thought process, I thought: what an amazing development of his acoustic neuronal pathways. I hope that when he dies, he donates his body to science, in order to study his acoustic neuronal circuitry. Wasn’t that a bizarre thought?

      1. Yes, this is a bizarre thought because I am neuro-typical of 100,000 years of evolution, and probably much further back in our family tree, both literally and figuratively. This ear shape is common to our arboreal relatives as well – all the great apes have functionally equivalent pinnae, and the self-organizing neural circuits should therefore be analogous. This is how you can hear in all directions at once, and simultaneously locate the acoustic boundaries and acoustically reflective objects.

        Humans can learn to navigate in the dark by making clicks and echo-locating objects, walls, and holes like the blind bicyclists who ride between traffic and parked cars, and also to the edge of cliffs. The more advanced neural circuits enable PASSIVE aural navigation, locating tree trunks, roots that can trip us, and other salient features using the background sounds of the forest.

        For example, hunter-gatherers navigate over many square miles daily under the rainforest canopy where no direct sunlight penetrates, just by hearing the unique spacings of tree trunks. This ecological niche is so lacking in other directional cues that visitors get physically ill from the disorientation.

        Let me put this another way: learning to hear music through conventional audio (2 channel processed and mixed multi-tracks, small drivers in sharp edged rectangular baffles, high order crossovers, vented box, etc.) stunts the development of neural circuits to hear music. Since the phase and spatial information are randomly scrambled, this develops phase deafness as is well documented in the audiology literature. What you will not find are studies which selected professional musicians to test human hearing abilities.


© 2023 PS Audio, Inc.
