Mixing purity
Octave Records continues to be a learning experience for both Terri and me. Which is one of the reasons we so love what we do.
As we delve deeper into the process of music making, some things remain obvious while others surprise and delight.
For example, it’s obvious that the better the original recording, the closer the sound is to live: the right microphone at the right distance with the right preamplifier. And if we were only recording one person, one instrument, one setting, the process would be somewhat simple. But we’re not.
Once you get multiple instruments and singers, the process gets complicated. The right everything for an individual may or may not mesh well with the other microphones competing in the room. And then there’s the process many musicians favor of laying down a rhythm track and adding vocals after the fact. These need to be pieced together so it sounds as if they were performed live.
Once recorded, it comes time to mix the many channels together into a coherent 2-channel stereo product. And that’s where the challenge becomes exponentially more difficult.
In our minds, we want to maintain the purity of the original recording. Terri and I are purists at heart. So it’s not without a little guilt we sometimes alter the original with EQ or a touch of added reverb or (gasp) a smidge of compression. Truly, this stands the hairs on the back of our necks at full attention.
Yet, carefully applied, the end result is a purer version of the sound than the raw recording.
There’s so much for us to learn.
That’s the inherent dilemma of multi-mic recording in a recording room: you produce phase shifts and add dispersion-specific colorations. An instrument doesn’t have homogeneous dispersion; it beams different sound waves in different directions, some reaching a satellite mic but not the main mic. Why not try one-point recordings? No need to invest in expensive mixing consoles that add additional distortion, or in highly skilled artistic sound engineers; just focus on the overall sound of the orchestra or ensemble! 🙂
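The phase-shift side of this dilemma is easy to demonstrate numerically. Here is a minimal sketch (the 1 m path-length difference and 343 m/s speed of sound are assumed example values, not from the post) of the comb filtering that appears when a main mic and a more distant satellite mic are summed in the mix:

```python
import numpy as np

# Two mics at different distances from one source: the far mic's signal
# arrives delayed, and summing the two signals creates comb filtering.
fs = 48_000
extra_path_m = 1.0                      # assumed extra distance to the far mic
delay_s = extra_path_m / 343.0          # speed of sound ~343 m/s

freqs = np.linspace(20, 20_000, 1_000)
# Magnitude of 1 + e^{-j 2 pi f t_d}: a null wherever the delay is a half period
mag = np.abs(1 + np.exp(-2j * np.pi * freqs * delay_s))

print(f"first null near {343.0 / (2 * extra_path_m):.0f} Hz, "
      f"deepest sampled dip {20 * np.log10(mag.min() + 1e-12):.0f} dB")
```

Under these assumptions the first null lands around 172 Hz, with further notches repeating up the band; moving either mic shifts every notch, which is one reason multi-mic balances are so position-sensitive.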
Why shouldn’t you have the same experiences, free of all possible dogmas and prejudices, that most of the leading engineers seem to have had for themselves over decades of doing the job you’re just beginning? (I’m only talking about what I’ve read, not what I’ve experienced myself.)
I’m not saying you’ll end up doing all-analog recordings and producing turntables in 2030 😉 But if we’re talking about pure digital recordings, I’m quite sure of this: just as you found out that editing DSD in the analog domain instead of the PCM domain improves sound, you’ll find that sacrificing maximum accuracy in the rest of the chain, by using tube mics for example, will improve realism; that mixing in the analog domain instead of the digital seems to boost sound quality (in spite of all theoretical limitations); that analog compression is much less harmful than digital compression; and that, in spite of all the glorifying of formats’ higher dynamic ranges, even limiting techniques (like your example of added compression or EQ) can improve the REAL quality of a recording, and so forth.
The good thing is, if you’re really open to all of those experiences, you might be able to produce the best recordings in the digital domain. I’d just guess that, for some time yet, this might involve some steps you have so far argued against.
Jazznut, I think that you have posted a very compelling argument.
How can you “boost” sound quality? How can you “improve” realism? You have a signal. Any modification of that signal is a distortion or a specific modification to please the producer and musician. In the end, we will hear what they want us to hear.
If they found out that plug ins, compression or EQ will make the sound more satisfactory to their ears, so be it. This is what I expect from them. Once they finish, that’s what I want to hear. It would be best if they told me which speakers they used to “approve” the final version, the master. Then, I would hear much closer to what they heard (except for the room, of course).
“Boosting” and “improving” were just meant in terms of the final result, not necessarily in terms of adding something to the signal. But as we all know from the work recording and mastering engineers do, it also CAN mean that.
That has been my point all along. Most music is “produced” to sound good on two speakers. Producers and engineers tweak it to what they and the musicians like most or prefer. This is the new “real”.
If I understand you correctly I see it in a different way.
If your “old real” or “correct real” is a more or less unmastered, plain, uninfluenced recording, then there’s no positioning of performers, no holographic or pinpoint imaging, hardly any soundstage, no silky string sound, etc.
But listened to at home it also wouldn’t have the effortlessness, impact and non-criticality of a live concert in terms of room acoustics.
Without the visual element of a live performance, this usually sounds extremely boring.
There are different rules for making a recording fun and realistic (though still always artificial) sounding than for replicating exactly what the compromised recording of a performance offers.
And remember: a recording is not always done in a concert hall. Mostly it’s a simple room with loads of stuff in it, which is then pimped into a 3D soundstage experience.
But I am all in with engineers, producers, and musicians using the studio and its equipment as a tool, as another instrument to play. Some will be better than others.
I think everyone here should get a USB mic and start GarageBand and figure out themselves the artistry and science of the recording process.
Realism in the signal assumes that the feed from the microphone is real, i.e. a perfect analog of the sound field in the recording venue. It’s not; it’s a facsimile. It may be a very good facsimile, but still just a facsimile. A skillful recording/mixing engineer can sometimes finesse the recorded microphone feed to more closely match the actual sound field; never perfect, but closer.
But engineers know this and they use this particular issue to play with the recording. They make vocalists use the different mic aspects to get a particular sound. Same with mic’ed instruments.
“If you want something done right, you gotta do it yourself”, springs to mind.
Paul,
As you well know, in France there are many “purity laws” governing the content of food and beverages. The laws governing the content of wine are the most strict. A few years back a respected and award-winning vintner was brought before a judge for the crime of mixing grapes from different regions of France. When the defendant was asked by the judge why he felt the need to flout France’s purity law, the unrepentant vintner replied, “Because it made the wine taste better.”
Some laws and rules are meant to be broken, for the right reasons 😉
French wine laws have very sound socio-economic purpose and are not there to be broken.
I remember discussing the French wine laws with a young recently qualified oenologist working at Chateau de Mercues, a lovely old chateau high on a cliff outside Cahors in South-West France. Their viticulture goes back 2,000 years to Roman days. In that region they mostly grow one red grape, Malbec, and all the wines are at least 70% Malbec (up to 30% Tannat or Merlot). Some of the best are just 5% or 10% Tannat and some are 100% Malbec, which I think they call “Prestige”.
Consider them your pure DSD of wines. They are lovely to drink, but take ages to make (20 years typically), expensive, of limited supply and variety and frankly not many people outside of Cahors have heard of them. Getting more like DSD the more I think about it.
Adding some Tannat or Merlot makes life a lot easier, cheaper and adds greater variety. Think of that as mixing in DXD (PCM) or the analogue domain. Consequently the Prestige wines are rare.
Pure Malbecs from Cahors are very fine, but no one is saying this is the only or best way to make wine, or that they are the best wines available. From what I understood, commercially it is a very poor proposition with limited appeal. It is purity for purity’s sake, and that’s how they choose to do it. There’s nothing wrong with that, but most other people don’t do it that way.
I know another country that has become famous for Malbec. Also, Tannat is another grape that takes forever to mature properly. The Basque took Tannat to Uruguay where it does very well (now). Interestingly, a lot of it grows in sandy terrain. Uruguay’s weather and geology couldn’t be further away from the area where the Basque used to grow it.
They grow Tannat in Paso Robles as well. Tablas Creek, co-owned by the Perrins (Château de Beaucastel), makes great Tannats. Another French winemaker who moved to Paso, Stefan Asseo, calls his winery L’Aventure because he was not allowed to mix grapes the way he wanted in his birth country. Find them; they are truly delicious.
“There are nine and sixty ways of constructing tribal lays,
And every single one of them is right!”
— Rudyard Kipling, In the Neolithic Age
And yet again, the simple-minded formatting algorithm thwarts my desired text layout. Blank spaces matter.
Back in the ’60s and early ’70s, a popular college campus activity was “piano smashing,” aptly named because the goal was to smash a grand piano into bits small enough to pass through a 4″ opening or tube. The first team to complete the task won. The term is also used as an analogy for the task of “smashing” 32 or 64 individual tracks of audio down to a 2-channel product in the recording studio.
While in the studio, some prefer to record separate tracks individually and merge them into the final product. But wisdom and experience are required for the artists who still prefer the totality of a group of musicians performing together, in the studio or wherever they may record; for the engineer who knows how to properly mic the room to get the desired results; and for those who do live performances.
Indeed! And finding the best position for the microphone in the recording room is at least as time-consuming as finding the best position for the speakers in the listening room! Being that time-consuming, it’s much easier to do close-up miking, add some “room” sound, and play with 24 channels on the mixing console. 🙂 And the type of microphone used is much less important for the resulting sound quality than the microphone’s position!
I couldn’t agree more, Paul.
Getting really old school here. In the original Sun Studio in Memphis, TN, Sam Phillips would set up a single (presumably) omnidirectional microphone in the center of the room, with ‘X’s taped on the floor indicating where the musicians and/or their instruments should be located for the best sound blend and levels. In an interview with Rolling Stone(?), John Mellencamp, who recorded part of his album No Better Than This at Sun, showed that the marks are still there.
Hey Paul! I was just on Amazon to order the new book, and quite a few reviews say the actual printing and quality of the book is terrible??? Just a heads up.
Yup. That happened on the very first Kindle version where I just uploaded a PDF. I quickly fixed that but the reviews remain. Not much I can do about that. The printed version never had a problem.
Digital (Kindle) vs. analog (printed book).
Paul, just a clarifying question. Given you are not recording for on-air/radio playback or need to worry about the loudness wars, when is it ever useful to use compression?
https://forums.stevehoffman.tv/threads/i-was-asked-why-do-recordings-need-compression-limiting-during-recording-mastering.631903/
The few examples I have seen and worked with were on vocals. Some musicians aren’t that great working the microphone. They’ll pull away at the wrong time or lean in when they shouldn’t. A light touch on the compressor evens out the level of their vocals.
Another thing to consider is that if you have to reduce the level here and there for some reason, it doesn’t necessarily require a compressor (which is an automatic device and implies several losses). Often simple gain riding is done instead.
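For readers who haven’t used one, the difference is easy to see in code. Below is a minimal, illustrative feed-forward compressor sketch; the threshold, ratio, and time constants are made-up example values, not those of any particular unit. Signals over the threshold get turned down automatically, while material below it passes untouched, which is the job gain riding does by hand:

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0, attack_ms=5.0,
             release_ms=100.0, fs=48_000):
    """Toy feed-forward compressor: level above the threshold is reduced
    by the ratio; attack/release constants smooth the gain changes."""
    atk = np.exp(-1.0 / (fs * attack_ms / 1000))
    rel = np.exp(-1.0 / (fs * release_ms / 1000))
    env = 0.0
    out = np.empty_like(x)
    for i, s in enumerate(x):
        level = abs(s)
        coef = atk if level > env else rel
        env = coef * env + (1 - coef) * level          # envelope follower
        level_db = 20 * np.log10(max(env, 1e-9))
        over = max(level_db - threshold_db, 0.0)
        gain_db = -over * (1 - 1 / ratio)              # gain reduction
        out[i] = s * 10 ** (gain_db / 20)
    return out

fs = 48_000
t = np.arange(fs // 10) / fs
loud = np.sin(2 * np.pi * 440 * t)                     # ~0 dBFS passage
quiet = 0.05 * loud                                    # well below threshold

# After the attack settles, the loud passage is held well below its input
# peak, while the quiet passage (under the threshold) passes untouched.
settled = slice(fs // 50, None)                        # skip the first 20 ms
print(np.max(np.abs(compress(loud)[settled])),
      np.max(np.abs(compress(quiet)[settled])))
```

A gain rider would achieve the quiet-passage behavior by hand on the fader, without the envelope follower’s automatic (and program-dependent) gain changes.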
And this is why mixing engineers get paid more than mastering engineers.
If I may say I think Steven Wilson has grown into a pretty great audio engineer. His 5.1 mixes and his 16 track analog tape mixes are pretty gold.
Anyhow. I don’t take any of those great mixing or mastering engineers for granted. They have a huge skill and I’ve always looked at them as members of the band. A big part of the band. 🙂
I also really like Ian Sheppard and he is a terrific engineer who swears by dynamic range plugins to get the most out of your mixes.
Terrific audio tool.
https://www.meterplugs.com/dynameter
George Martin, the fifth Beatle.
When you take a photograph of something, you are only capturing one aspect of it. Even if you add another photograph taken at the same time from a different angle to create a so-called 3D image, you still have only one aspect of it. If another photographer photographed the same scene at the same time, you’d get a different photograph. Give him a different camera, different lenses, and different film, and the two images, yours and his, would be very different from each other. Neither is “right.” Neither captures the essence of what the subject is. It can only be associated in your mind with memories of aspects of similar objects. For someone who has never experienced the seashore, a photograph of it would be meaningless. If someone didn’t tell him what it is, he wouldn’t even be able to guess, having had no prior experience with it. If all he had ever seen was photographs of it, he might recognize it as a picture of a seashore, but that’s all.
Then there is context. When you experience the real thing, there is the environment it is in. A tree in a forest is in a different context from one on a sidewalk or in a backyard. The lighting is different; its surroundings are different. Photographs, even panoramic photographs, can’t capture that.
The closest we can come to duplicating a visual image, IMO, is a planetarium. It is not a photograph; it is a reconstruction based on thousands of photographs. It recreates the full context. Even if the stars were not in scientifically correct positions, the sense of it would be the same. If there were large patchy black areas where there was nothing, it wouldn’t work.
This is analogous to the problem of recreating a sound. Having a good camera and lens, sharp focus, and accurate colors for even just one image is not nearly enough to recreate the experience. All of the other factors must be addressed, if not with scientific precision then at least in the essence of the source and its context, the acoustic environment that has a major impact on the way your brain perceives it. Can you do that? Can your industry? IMO they’ve hardly scratched the surface. Funny how you aim five speakers directly at yourself from five different directions and call it surround sound. Does it work? Most audiophiles would say not only doesn’t it work, it makes things worse.
Great post, Soundmind.
Got me thinking about the analogy Alan Parsons drew about the potter (the mixing engineer) who molds the clay (the music mix).
In any case, like photography if you put too much spice on the taco or put too many things on it the shell will ultimately crack and could possibly taste like shit. Lol. 🙂
“Reality! What a concept!” — Robin Williams
Exactly! Memories are holographic patterns stored in neural networks that physically and topographically mirror the reality they are storing. Although vision is 2.05-dimensional, the hearing, haptic, and proprioceptive senses are truly three-dimensional, and all senses are integrated into a 3D map of external reality.
Vision is much easier to fool, but if you spend thousands of hours listening to 1.1-dimensional two-channel audio (mixed and panned), you can develop a delusion of hearing an “image.” Hearing also develops illusions in the presence of background masking noises like motor humming, gears grinding, tires rolling, and turbulent fan-driven airflow.
I have never mixed down a recording, but I do have some ‘front of house’ experience (I know, apples and oranges) from my church. I set up the mics and DI boxes for the performers (well, there were three overhead mics always hung above the choir, and a mic bar set up for the piano) and the stage monitors. The drum kit was not miked; you just have to trust the drummer to show restraint and not go full-tilt Keith Moon. It was all mixed down to mono in analog, greatly simplifying things. The main thing was to get the balances right (whatever that is; mixer’s judgement call), and I seldom touched the 4-band equalization on the board. However, occasionally I did finesse things a bit, again the mixer’s judgement call, but always with a light touch. Their purpose was to make a joyful noise unto the Lord; my job was to get that joyful noise (actually, we had some real talent available) out to the congregation as best I could, but mostly to stay out of the way.
MIXING IS DISTORTION
(and every other knob too!)
I started recording seriously in 1973, and was always searching for the purity. I had temporary infatuations with studio artistry, but even having musicians in an isolated room is artificial. To get a great performance from the heart – and not just intellectually “perfect” – you need an audience. The problem with that is the PA system.
My favorite recordings were live, and classic two mic live recordings of acoustic performance were the closest to purity. I learned how to work a room, to modify a room for playback, and how to tell which room fit the music (yes, I used to clap my hands) – but there was still something missing, and something that distorted: PA speakers and mixing consoles.
The standard model of hearing and the theory of stereo did not fit what I heard. Pan pots didn’t work; artificial reverb didn’t work. I could get solo tracks sounding good, but the more I added to the mix, processing the tracks to fit them together like a watchmaker, the less clarity was left. In fact, a watch is a failed metaphor. This was like making a stew when I was searching for nouvelle cuisine or sushi. The Japanese art of food emphasizes individual bites of contrasting colors, textures, and flavors; this is how music sounds in real chamber ensembles, with the individual flavors of flute, oboe, piano, and violin clearly delineated even when they are blended together in the score.
I got a few clues when I discovered the time distortion of bass speakers, the spatial and spectral distortion of diffraction, and the spectral contamination of Doppler distortion, so I started building speaker prototypes to fix those flaws. I had an attic TV room and was having trouble understanding dialog on the IFC channel because too many close echoes were smearing consonants (that, and indie elocution or the lack thereof), so I looked into directional control.
While doing this I discovered that the human voice projects a 90-degree cone throughout the midrange, wider in the bass and narrower in the treble. The dividing points are similar to the corner frequencies of the ear mechanism, ~400 Hz and ~4 kHz, which are prominent in the loudness curves. Bass and treble extension comes from what our host calls the “internal DSP”: neural circuit compensation.
I built a speaker with a cardioid midrange, omni woofer (bipole) and super-cardioid tweeter. This not only made the speech clearer in my TV room, it made the reproduction of speech match the polar pattern of acoustic speech, and it sounded more natural. Spatial reproduction accuracy, WHAT A CONCEPT!
That spatial pattern doesn’t fit anything else, but that was fine because it was intended as a center channel speaker for movie and TV. ITU mixing puts the dialog in the center, the soundtrack music in the FL and FR and the sound effects wherever.
It was about this time that Dr. Manfred Schroeder told me why my musician friends don’t hear a phantom center, a psycho-acoustic trait I share with them. We hear where sound sources are by correlating and triangulating the room echoes, which works for acoustic instruments but not for conventional speakers. This means I can hear where my speakers are, and the stereo illusion collapses.
Putting more than one microphone into a single speaker is SPATIAL DISTORTION. So is splitting one microphone into more than one speaker (panning). As my brain was re-calculating everything I know about audio production, it became clear what I needed: audio systems with no mixing. For every mic, there is one speaker. For every speaker, there is one mic. This is how close miked multi-tracks are supposed to work.
Easy to say, hard to do. First, you need to make a consort of speakers to support the timbral, temporal, transient and spatial characteristics of every instrument you seek to reproduce; and at every angle, the frequency and phase response had to be flat – in fact it requires waveform coherence. They need to be suspended at the same height as the instrument, and in the approximate position. This poses a challenge in gain-before-feedback with live amplification.
The acoustics have to support a wide range of orchestrations and musical styles. The good part is since the instruments and speakers are acoustically congruent, if you get it right for real chamber ensembles, it works for amplified ones too so there are good models to follow.
BUT, one size, shape and acoustic treatment do NOT fit all. I had to learn how to use acoustic mixing and balancing, acoustic EQ, acoustic compression and acoustic reverb to replace every knob in a recording studio for the capture mic pair in the audience. If you can do that, then you can get a sound that translates to the most number of playback situations with the live energy and the absolute clarity of ZERO KNOB recording.
Knobs all create TIME and TRANSIENT DISTORTION. Anything that modifies the sound, be it equalization, compression, or reverb, must feed back a delayed version of the signal and mix it in. This inherently smears time and waveforms in a way that does not occur in acoustics.
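That claim is easy to check for a digital EQ. Here is a minimal sketch, assuming a standard RBJ audio-EQ-cookbook peaking biquad (the +6 dB boost at 1 kHz is an arbitrary example, not from the post), showing that the filter’s phase response is frequency-dependent, i.e. different frequencies come out shifted in time by different amounts:

```python
import numpy as np

def peaking_eq_coeffs(f0, gain_db, q, fs):
    """RBJ audio-EQ-cookbook peaking-filter (biquad) coefficients."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def phase_response(b, a, freqs, fs):
    """Phase of H(e^{jw}) at the given frequencies, in radians."""
    w = 2 * np.pi * np.asarray(freqs) / fs
    z = np.exp(-1j * w)                 # unit delay e^{-jw}
    H = (b[0] + b[1] * z + b[2] * z**2) / (a[0] + a[1] * z + a[2] * z**2)
    return np.angle(H)

fs = 48_000
b, a = peaking_eq_coeffs(f0=1_000, gain_db=6.0, q=1.0, fs=fs)
freqs = np.linspace(20, 20_000, 2_000)
phase = phase_response(b, a, freqs, fs)

# The phase shift is near zero at the band edges but swings around the
# boost frequency, so different frequencies are delayed by different amounts.
print(f"max |phase shift|: {np.degrees(np.abs(phase).max()):.1f} degrees")
```

Whether that frequency-dependent delay is audible is a separate argument, but the sketch confirms the mechanism: a minimum-phase EQ cannot change magnitude without also changing phase.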
Nothing else can satisfy ears used to sitting on the stage or in the front rows. Everything else is second class.
This is why I love DSD. You can’t do math on a one-bit signal. So Paul and Gus, every time you reach for a knob, you have to decimate first, and then you lose the time resolution and coherence of DSD. Every knob distorts phase and the corresponding spatial cues. In fact, modern recording techniques and speaker design deliberately smear spatial cues! AFAIK, the systems described by audiophiles and reviewers as having “pin-point imaging” have substantial cepstral distortion artifacts to support the delusion of 2-channel stereo! So do most listening-room pictures I see, with bare walls, floor, and ceiling at the first reflection points.
But often people will pay more for a deliberate illusion than for the truth. I was never satisfied by a book telling me that heaven awaits me; I want to drink the Ambrosia while I still live.