Ringside seating
On most recordings, there is a combination of close and distant miking. The close miking gets us a closer-than-natural, intimate view of the instrument or performer, while the distant microphones add the missing ambience and space close-miking cannot capture.
What’s odd about this miking technique is that it works despite the fact we are never as close to the instruments as the microphones.
One way to think about this is to visualize actual musicians in the room. Let’s use a single cello in our example. Mentally place the cellist a few feet behind the loudspeakers. Now, close your eyes and imagine how that would sound from your listening seat.
What you are hearing is a combination of the direct sound from the bow and string coupled with the room’s interactions.
Now, mentally replace our imagined performer with the close-miked cello. It sounds “the same” because the distance between the speaker’s rendition of the close-miked sound and the listener mirrors the distance between our imagined performer and where we’re sitting.
It may seem counterintuitive to place microphones closer than our ears ever go, but that's how we get musicians in our rooms.
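As a rough back-of-the-envelope sketch of the distance arithmetic (a minimal illustration only; the 1 ft mic distance and 8 ft listening distance are assumed values, not figures from the post): in a free field the direct sound falls about 6 dB per doubling of distance, so the speaker-to-listener distance restores much of the perspective that close miking removes.

```python
# Rough illustration only: free-field inverse-square law. The mic and
# listening distances are assumptions, not figures from the post above.
import math

def level_change_db(from_dist_ft: float, to_dist_ft: float) -> float:
    """Change in direct-sound level when moving from one distance to another."""
    return 20.0 * math.log10(from_dist_ft / to_dist_ft)

mic_dist_ft = 1.0      # assumed close-mic distance from the cello
listen_dist_ft = 8.0   # assumed listener-to-speaker distance

print(f"Direct sound at the listening seat is about "
      f"{level_change_db(mic_dist_ft, listen_dist_ft):.0f} dB below "
      f"what the close mic captured.")
# ~ -18 dB of geometric falloff, plus the listening room's own reflections,
# is part of what turns the microphone's perspective back into
# "performer a few feet behind the speakers".
```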
I think that's exactly why close mic'ing works reasonably well for single instruments or small groups, where the relatively close listener-speaker distance halfway matches a possible real situation (you're sitting 5-10 ft away from a performer in a very small concert room).
It doesn't work well for ensembles that usually sit at a much larger distance from the audience (orchestras, choirs, etc.)… and to be honest, even for chamber music it doesn't work so well, as in reality we usually sit 30 ft or so from the performers.
So I understand the logic you pointed out, but I think a reality check of the distances makes clear why distantly miked or one-point recordings have their charm, and why close mic'ing, even when supported by distant mics, has an artificial component we're not used to from live listening.
From a live orchestra we hear the string sound from 30-60 ft away, damped by the audience and the concert room. How is that supposed to match the directly picked-up, close-miked sound, from which we sit away just our 5-10 ft speaker distance?
The closest analogy to what we're hearing in our listening room is probably to imagine we miniaturized the orchestra so it fits in our room (plus several feet for the soundstage that expands beyond the room walls). The back-row performers then still sit farther away than our listener/speaker distance, but at least it halfway sounds like a (distance-wise) miniaturized orchestra (just not like a real one).
I think what we proved today is how different our hi-fi listening in fact is from live listening.
Personally, I think that the closer the mic, the more detail is captured, and better capture of detail is paramount for me.
Realistic, 3D imaging due to well-placed mics is nice; however, with 60-year-old ears, more detail is gonna give me more listening enjoyment.
I can decide how far from my loudspeakers I want to sit.
I regularly go (went) to live broadcasts at the Wigmore Hall and they use two microphones about 15 feet in the air above the front middle of the small stage.
I understood that it is very much an art dependent on a myriad of things, but one thing you can tell is a recording that is too close, without enough room sound. The most important factor seems to be the venue, and most classical recordings seem to be done in relatively large venues and to a much lesser extent in recording studios.
Contemporary requirements often lead to recordings whose reproduction qualities may be good but which give no real basis for judging the stereo qualities of a high-end audio system, or indeed its sound quality! This is particularly applicable to multi-microphone and/or multitrack recording techniques, where the final result is usually a combination of the producer's and the engineer's artistic temperament and may bear no resemblance to anything that might be heard at a live performance.
In most cases, live classical broadcast sources will give the most accuracy. These usually entail two microphones (hopefully, though not much anymore, a crossed pair of figure-eights) with another microphone placed in a particular position with respect to the back of the orchestra (BBC practice since the sixties; not sure how much today).
It is practically impossible, of course, to judge the reproduction of an audio system on multitrack, purposely manipulated and engineered source material, as in this respect there is no reference from which to start (which is why judging your system on MOST rock, electronic and other types of popular music is laughable from the get-go!).
I agree entirely and never use any amplified or synthetic music for evaluation purposes. It doesn’t stop me listening to electro and amplified music etc.
The BBC has a lot of favoured venues with fine acoustics, one of the more famous being the old Kingsway Hall, where FFRR (full frequency range recording) started; the old orchestral studios at Maida Vale; and the ultra-modern MediaCityUK in Manchester https://www.dock10.co.uk/thestudios/, where the orchestral studio is 6,400 sq. ft. and seats an audience of 250, which must improve the acoustic as well. One of the best orchestral recordings of 2019 (Shostakovich 11, Storgards, BBC Phil) was recorded there and issued by Chandos on 24/96 and SACD multi-channel (what SACD was intended for).
The top instrumental recording of 2019 was Levit's Beethoven at the Reitstadel, Neumarkt, a 500-year-old building rebuilt as a 450-seat concert hall with acoustics purpose-built for recording.
In comparison, smaller recording studios often seem to produce recordings that are dry and lacking ambience.
London is lucky because the suburbs were built in the latter part of the 19th century when the mass transit system was installed, and philanthropists built numerous large, cavernous brick churches that are excellent for recording and very cheap to hire. Some have even been turned into recording studios, like Air Studios.
I’m not sure I agree with Paul’s premise, as during lockdown I have listened to live performances sitting very close to the performer. The last recital I was following the score from about 10 feet behind the pianist. You still have the full sound of the room, it is not like a sterile close-mic studio recording.
Indeed, some of the older Kingsway Hall Decca recordings (many of which were engineered by Kenneth Wilkinson) are most magnificent, to say the least! (Find some at Pro Studio Masters.)
Do you have a link to the "Levit's Beethoven at the Reitstadel, Neumarkt" recording of 2019?
Thanks
It’s the Igor Levit Beethoven piano sonatas cycle. Most recorded at the Reitstadel, some here https://www.funkhaus-berlin.net/p/studio-1.html where he’s done quite a few of his other recordings. Looks pretty fantastic.
Thanks so much
Howard
I think it is the figure-8 microphones, used for a diffuse bidirectional pickup, that give you that rear feeling. You can set a condenser mic's pattern that way.
This example also makes clear why a stereo system can only create a rather artificial, non-live sound. The recording from the second mic adds concert hall ambience to the omnipresent listening room ambience. And the second microphone catches information that is delayed compared to the first, nearfield microphone, while a listener in the audience doesn't face these phase-delay issues. Not to mention the problem with mic arrays, where each microphone catches different ambience information: a real mess. No wonder a mixing console with many tools is required to get an acceptable sound. Some recording techniques seem to be a job-protection strategy for sound engineers! 🙂
However the microphones are placed at a live event, it is not enough for our auditory mechanism to register the result as a convincing duplication of that event, and this lack of correspondence becomes even more evident as the orchestral forces grow.
Add to this the inability of current technology to convincingly capture the real timbres of the instruments and, in many cases, the tonal balance and above all the dynamic contrast of a symphony orchestra, and we have to admit that even with the advances in digital technique there is currently an unbridgeable distance between the actual event and the facsimile that can be achieved at home.
I have always maintained this position in this space, but at the same time I believe that, for the moment and in light of the above, all we should care about is that our domestic equipment satisfies us while we listen to it, and that we banish from our minds any attempt to compare it to the actual event, for our own peace of mind.
This position, far from being conformist, is realistic and pragmatic given the current moment, since no matter how much we spend on a system we will always face the reality described here.
Let the great scientists and highly specialized technicians achieve a true revolution in technology, mainly in the recording chain, since current attempts in the reproduction chain, no matter how complex they may seem, only try to "manipulate" our hearing.
With all the potential problems mentioned here a newcomer, audiophile or not, may well ask "why do we bother?". However, despite the limitations, when we do bother the results can be incredible, fantastic and most enjoyable. We don't recreate the actual event in our homes, and in all honesty, do we really want that? Taken literally it would be ridiculous, but we manage a more than acceptable facsimile. Is it a case of near enough being good enough? That's up to the judgement of the listener, but it's the best we've got… at the moment.
It’s actually very simple to make an accurate sound recording and has been since the 1930s. The problem is that our other senses help us to focus the experience while audio without the other senses is plagued by distractions from noise in the environment. Today’s additional issue is that musicians want to be able to fix their performances unlike what was possible 60 years ago.
Sad to say, at this point we have generations who have never experienced live unamplified music. High fidelity is utterly meaningless to most people today. Hopefully some in the younger generation will pick up the ball and start performing acoustically. It is a massive opportunity.
Normal microphones, whether omni, cardioid or hypercardioid, do not distinguish between direct and reflected sounds like our human ear/brain hearing system does. Whether capturing an individual instrument or an orchestra with chorus (or the dialog and effects for movies), microphones need to be placed much closer to the source than a pair of ears in order to approximate the subjective sound that a brain connected to those ears would perceive. Our brains are able to separate and locate the reflected sounds of the venue — or the natural sounds from all around in the case of outdoor location recording for film — and latch onto the direct sound.
Mics don’t hear like brains, but a mic placed close to the source sounds more like what we hear seated farther away. It’s not a great approximation, but it’s acceptable. And recording engineers often take it to extremes for a variety of reasons having to do with post-production. Reverb and atmosphere can be added in the mix to bring life back to a dry recording, but an overly reverberant (wet) recording cannot easily be fixed. Digital processing to remove reverb in a too-distant recording will quickly degrade the sound.
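To put rough numbers on that, here is a minimal sketch using the standard diffuse-field textbook model of direct-to-reverberant ratio; the hall absorption and the source directivity below are assumed values, not measurements from any recording discussed here:

```python
# Minimal sketch of the textbook diffuse-field model: how the
# direct-to-reverberant ratio falls as the microphone moves away from the
# source. Room absorption and source directivity are assumptions.
import math

def critical_distance_m(Q: float, sabine_absorption_m2: float) -> float:
    """Distance at which direct and reverberant levels are equal."""
    return math.sqrt(Q * sabine_absorption_m2 / (16.0 * math.pi))

def direct_to_reverb_db(r_m: float, Q: float, A_m2: float) -> float:
    """Direct-to-reverberant ratio at distance r (diffuse-field model)."""
    return 20.0 * math.log10(critical_distance_m(Q, A_m2) / r_m)

Q = 2.0       # source directivity factor (assumed)
A = 350.0     # total Sabine absorption of a mid-size hall, m^2 (assumed)

for r in (0.3, 1.0, 3.0, 10.0):   # mic or listener distance in metres
    print(f"{r:5.1f} m : {direct_to_reverb_db(r, Q, A):+6.1f} dB direct-to-reverb")
# A mic at 0.3 m hears mostly direct sound; a listener 10 m out hears
# mostly the hall, which is roughly the trade-off described above.
```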
Doing my own recording, like The Matrix, opened up a whole world for me and I can't "unhear" what I found. Microphones don't "hear" like we do and never will. What we perceive is processed and projected onto our environment, just like vision. What we experience is not "real" in the empirical sense. Getting a good recording is truly an art, and an elusive one at that.
I am building a pair of mandolins currently, so I searched the interwebs for just the right formula for making one that sounds great. What I found was that the big names in this field say something like this: "I build a bunch of them and a couple of them sound fantastic, the rest OK or duds." A new room for an engineer requires a combination of skill and luck as well. In my case, all luck.
Even "pure" recordings are likely processed with some measure of compression and EQ to get them to sound closer to what we expect. A good mixing engineer can combine all sorts of mic placements to produce some surprisingly good results. My favorites are usually stereo pairs with the room added, but I can't deny that there are some combos of close and distant mics that sound darn good in a concert hall setting. Studio work can often be heavily processed by necessity, unless you have a great room. On the other hand, Rudy Van Gelder got stunning results in his living room. Go figure.
If it were not for Paul's posts, where would one get such bits of information unless one was into recording? Thanks for the immense service you provide. Regards.
Ever since I first listened to some of "Paul's Picks" I have had the question: why don't all recordings sound like this? Like his recommended Nils Lofgren track. This helps answer it: it's a tricky process, and indeed as much art as science. One of my early classical recordings was the Beethoven piano concertos by Leon Fleisher with the Cleveland Orchestra. The piano was great, but the orchestra always sounded to me like it was behind a curtain. Like Paul often says, it's a tradeoff.
Many musicians know that oftentimes hearing the instrument up close, like a close mic does, will sound bad. Drummers do not tune their drums with their ears inches away like a close mic. Matter of fact, it's understood in a live situation that drums begin to sound their best around twenty feet away. For what it's worth.
At the risk of being told that I’ve posted the same thing in other terms here countless times before I’ll post it again. If you don’t want to read it again don’t go any further. Move along, move along, nothing new to see here.
In 1974 I solved this problem in my own unique way, using standard analytic tools I had acquired in school that evidently had never been applied to this problem in this way. Knowing that room acoustics plays such an important role in what we hear, my goal was to understand exactly what room acoustics does to sound. This is applicable to any size and shape of room made of any materials. Another goal was to see if it was possible to recreate, from a recording, the acoustic effects of a very large room, say a concert hall, in a small room in my house.
Here are some of the tools I used.
Vector Calculus
Fourier analysis
Mapping functions
Field theory (tools from both electromagnetic and fluid dynamic fields)
Material science (understanding what the acoustic reflective properties of materials does to sound)
Here is the methodology I used
Understand the sound field propagated at one point in space
Understand the resulting sound field arriving at another point in space
Understand the relationship between the two
I call the mathematical relationship Acoustic Energy Field Transfer Theory.
Sound as affected by acoustics is a time varying fluid dynamic field in a partially or totally enclosed space. The working fluid is air.
So far I have found only one other approach that works, Wave Field Synthesis.
It seems to arrive at an equivalent conclusion through a different path.
The tools it starts with are
Huygens-Fresnel principle which states that any wavefront can be regarded as a superposition of elementary spherical waves. Therefore, any wavefront can be synthesized from such elementary waves.
Kirchhoff-Helmholtz integral
https://en.wikipedia.org/wiki/Wave_field_synthesis
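As a toy illustration of where the Huygens idea leads in practice, here is a sketch of the simplest possible WFS-style array feed: each speaker gets the source signal with the delay and 1/r gain of the path from a virtual source behind the array. Real WFS driving functions add filtering and amplitude-correction terms this leaves out, and the array geometry and virtual-source position below are assumptions.

```python
# Toy sketch of the delay-and-gain idea behind WFS (not the full
# Kirchhoff-Helmholtz driving function). Geometry values are assumed.
import math

C = 343.0  # speed of sound, m/s

def wfs_taps(virtual_src, speaker_positions):
    """Per-speaker (delay_s, gain) for a virtual point source behind the array."""
    taps = []
    for sp in speaker_positions:
        d = math.dist(virtual_src, sp)            # path from virtual source to speaker
        taps.append((d / C, 1.0 / max(d, 0.1)))   # travel-time delay + simple 1/r gain
    return taps

# 8 speakers spaced 0.2 m along the x axis, virtual source 3 m behind them
speakers = [(0.2 * i, 0.0) for i in range(8)]
source = (0.7, -3.0)

for i, (delay, gain) in enumerate(wfs_taps(source, speakers)):
    print(f"speaker {i}: delay {delay * 1000:6.2f} ms, gain {gain:4.2f}")
# Feeding each speaker the source signal with these delays and gains makes the
# array re-radiate an approximation of the wavefront the virtual source would emit.
```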
The results of the two methods are interchangeable in either of two ways: solving the AEFTT equation for every point on a sphere using multiple sources of sound, solved for one source at one point on the sphere at a time and then using the principle of superposition (vector addition of the solutions for multiple sources at each point on the sphere); or by expanding the arrival point of the vectors, projecting them backwards in time and outward in space to where they intercepted the sphere, as in WFS. Either way works. The reason they are the same at any point within the sphere is based on the geometric principle of congruence.
When fully exploited to their ultimate potential both systems would look the same and result in an enormous number of speakers, amplifiers, and electronic circuits that would include time delays, equalizers, and mixers by the thousands. The hard wired signal processing could be replaced by a computer simulating the results of the equation through emulation software, a kind of super digital signal processing through computer programming.
In either case practical prototypes of either method require drastic compromises. WFS is usually demonstrated with hundreds of speakers shoulder to shoulder in a horizontal 360 array aimed directly at the audience. It completely eliminates simulation of the vertical components of the field. It has not been adapted for commercial use and certainly not for commercial recordings. It requires an anechoic source, either a recording made in an anechoic chamber or live performers in an anechoic chamber.
Environmental Acoustic Simulation, the practical application of AEFTT, makes different compromises. It uses multiple small speakers around the perimeter of the room, which direct their sound at the walls and ceiling, which act as diffusers. The recording first has to be normalized, that is, equalized to get its sound flat back to the microphones. This is fed to the two main stereo speakers, emulating the direct arriving sound and early reflections on the recording. The other speakers are fed with a processed signal that reconstructs the elements of the reverberant field by having the same relationship to the direct field as they would have in the large space. The result of combining the two fields is recreation by reconstruction instead of reproduction. Unlike audiophile sound systems, it is based on the observation that the recording is far from perfect, capturing only a portion of what the audience would hear. The missing parts must be recreated by reconstructing what is missing from what the recording has. Each recording requires its own unique settings, and there is no one right setting for any of them, if for no other reason than that there is no data for the live space. Due to variables in the recordings and playback room acoustics, the term "accurate" is therefore meaningless. Convincing is a much more appropriate way to describe the results.
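This is not the poster's actual Environmental Acoustic Simulation processing, which is not spelled out here, but the generic idea of feeding perimeter speakers with delayed, attenuated, high-frequency-damped copies of a normalized main signal can be sketched roughly like this (all tap times, gains and the damping value are invented for illustration):

```python
# Generic "ambience speakers fed with delayed, attenuated copies of the main
# signal" sketch. Every parameter here is an invented placeholder; this is NOT
# the poster's actual processing chain.
import numpy as np

def ambience_send(mono: np.ndarray, fs: int,
                  taps_ms=(23.0, 41.0, 67.0, 103.0),
                  gains=(0.35, 0.28, 0.22, 0.18),
                  hf_damping=0.3) -> np.ndarray:
    """Build one ambience-channel feed from the (normalized) main signal."""
    out = np.zeros(len(mono) + int(fs * max(taps_ms) / 1000.0))
    for t_ms, g in zip(taps_ms, gains):
        n = int(fs * t_ms / 1000.0)
        out[n:n + len(mono)] += g * mono          # delayed, attenuated copy
    # crude one-pole low-pass to mimic high-frequency loss of late reflections
    y = np.copy(out)
    for i in range(1, len(y)):
        y[i] = (1.0 - hf_damping) * out[i] + hf_damping * y[i - 1]
    return y

fs = 48_000
main = np.random.randn(fs)            # stand-in for a normalized program signal
rear_left = ambience_send(main, fs)   # one of several perimeter-speaker feeds
```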
The two systems use a different methodology to determine the acoustics of the live space. WFS uses an impulse as the source and measures the response with an array of directional microphones. The method doesn't work because it cannot segregate the time of arrival of delays from the spectral change to them as a function of time. This is a critical parameter. For this reason the measurement methodology was the most frustrating part of my entire invention, and I nearly gave it up entirely. Then some weeks later I figured out a way to measure the effect one frequency at a time. Ironically it has something in common with the method Wallace Sabine, the father of acoustic science, used, plus a simple trick from calculus. It's far more time consuming and tedious than the WFS method, but it accounts accurately for every reflection from every direction.
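The Sabine-style, one-frequency-at-a-time decay measurement alluded to here can be sketched roughly as follows; the 500 Hz tone and the synthetic decay are made up, and this is not necessarily the poster's exact procedure:

```python
# Sketch of an interrupted-tone decay measurement at one frequency.
# The synthetic 500 Hz decay stands in for a real room recording.
import numpy as np

fs = 48_000
rt60_true = 1.8                          # pretend decay time of the room, seconds
t = np.arange(int(fs * 1.0)) / fs
envelope = np.exp(-6.91 * t / rt60_true) # -60 dB after rt60_true seconds
recorded = envelope * np.sin(2 * np.pi * 500.0 * t)

# Smooth the rectified tone to recover the envelope, then fit a line (dB/s)
win = int(fs * 0.02)                                        # 20 ms window
env = np.convolve(np.abs(recorded), np.ones(win) / win, mode="same")
env_db = 20 * np.log10(env + 1e-12)

# Fit the mid-decay region, skipping the smoothing-window edge at the start
mask = (env_db < -5) & (env_db > -35) & (t > 0.05)
slope_db_per_s, _ = np.polyfit(t[mask], env_db[mask], 1)
print(f"estimated RT at 500 Hz: {-60.0 / slope_db_per_s:.2f} s "
      f"(true value {rt60_true} s)")
```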
Once I had access to the internet I looked to see how professionals in acoustics understood their science. Theirs is a zoo of different parameters, each trying to describe one aspect of an entire room with one number. I guess you are supposed to put them all together to gain a comprehensive understanding, but IMO if you do, you will come out even more confused than before you began.
That's a lot of rhetoric, SM, for a "music playback" system that's primarily comprised of a $99 JVC CD player feeding a 1970s HK Citation 11 preamplifier and multi-band graphic equalizer, amplified through primary speakers and a couple dozen circular-array piezoelectric tweeters driven by god only knows which multi-channel amplifier modules, processed through a couple of Audio Pulse Model 1-inspired delay circuits. Is this your idea of high-end audio?
With your level of imaginative prose perhaps consider being a tad bit more respectful when you speak of John Atkinson or Michael Fremer waxing rhapsodically about the music performance and/or measurements of modern high-end audio components.
But in the end, if your little experiment brings you pleasure, who am I (or are we) to judge?
First of all, the CD player was $199. There are no piezoelectric tweeters in the system. I rejected the Audio Pulse Model 1 as unacceptable and not suited for its functional purpose in the system design.
Fremer and Atkinson know what they like and that is about all. Fremer is a particularly detestable man IMO, and Atkinson is nothing more than a glorified bench tech. While I have no personal grievance against Atkinson, I have no respect for his knowledge either. But I don't stop there. I extend that assessment to everyone in this industry. It's not based on personal dislike or envy; it's based on lack of real knowledge and perpetual failure. Worst of all, they aren't even aware that they have failed. At least Paul is aware of it. He proved it when he compared the sound of the best collaborative efforts of his team's lifetimes with the sound he heard at the New York Metropolitan Opera and judged the team's results as "canned music." If that doesn't scream failure, then what does?
If microphones captured audio waves perfectly (which none do), the most realistic solution would be a single pair of omni-directional microphones (one left and one right) spaced apart the same as the human ears, with a baffle between them the same shape and consistency as the human head. This is the concept behind the Jecklin Disk microphone technique. The microphones would be positioned at the same distance from the source as the recording engineer wants the listener to perceive it. But microphones are not perfect. As good as some are, they do not receive and register sound as precisely and dynamically as human ears. Some capture the tone and soul of instruments better, usually up close, and others capture the bloom of the instruments and ambience of the space, usually at a distance. So, a combination of close directional and distant omni-directional microphones often (but not always) yields the best result, balancing the strengths and weaknesses of each microphone type. On Chesky's Ultimate Demonstration Disc, one beautiful holographic recording of a choir and organ in a cathedral was made by suspending a single pair of omni-directional microphones high above the floor in the nave. There are no close mics. On a good audio system the realism is stunning.
It won't work. It will not be convincing. Sound fields are vector fields. Every sound, from the source and from the reflections that are most of what reaches your ears, has a direction of arrival. For the direct sound this is how you tell where the sound is coming from. It's almost, but not quite, a point source. There are a few exceptions, grand pianos and pipe organs, but individual instruments each have a single perceived point of origin. The apparent source width (ASW in acoustician speak) is larger for these larger instruments, but even each voice in a choir has a point of origin. The reverberant field, by contrast, comes from EVERY DIRECTION in such rapid-fire succession, with even the loudest reflections indistinguishable from the rest, that you shouldn't be able to tell where it is coming from. The difference between the reflections coming from the left and right controls what is called the binaural quality index (BQI in acoustician speak). This determines the perception of lateral space. Reflections from above and behind, as well as left and right, determine listener envelopment (LE or LEV in acoustician speak). The requirements for the speakers to reproduce these two different kinds of sounds are very different.
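For reference, the BQI mentioned above is commonly taken as 1 minus the peak interaural cross-correlation (IACC) of a binaural room impulse response; a minimal sketch of that measure, using synthetic noise in place of real measured responses, looks like this:

```python
# Sketch of the interaural cross-correlation measure behind BQI (commonly
# BQI = 1 - IACC). The two "binaural" responses below are synthetic noise
# bursts, just to make the code runnable; real use needs measured responses.
import numpy as np

def iacc(left: np.ndarray, right: np.ndarray, fs: int, max_lag_ms: float = 1.0) -> float:
    """Peak of the normalized interaural cross-correlation within +/- 1 ms."""
    max_lag = int(fs * max_lag_ms / 1000.0)
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.sum(left[lag:] * right[:len(right) - lag])
        else:
            c = np.sum(left[:lag] * right[-lag:])
        best = max(best, abs(c) / norm)
    return best

fs = 48_000
rng = np.random.default_rng(0)
decay = np.exp(-np.arange(fs // 2) / (0.4 * fs))
left = rng.standard_normal(fs // 2) * decay     # stand-in binaural responses:
right = rng.standard_normal(fs // 2) * decay    # fully decorrelated -> low IACC

val = iacc(left, right, fs)
print(f"IACC ~ {val:.2f}, BQI ~ {1 - val:.2f}")   # decorrelated ears: high BQI
```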
The microphones turn the vector fields that land on them into scalar electrical signals, losing all of their directional properties. The speakers turn them back into vector fields which have absolutely nothing in common with the live field that landed on the microphones and was recorded. So far there is no known way to arrange microphones so that they capture the direct and reflected fields independently. These are among the reasons why quadraphonic sound and its descendants don't work. When all of the reflected sound comes through the same speakers as the direct sound, if there is much of it, it sounds like the source was inside the Holland Tunnel and you are outside listening to it. Ralph Glasgal calls it "the sewer pipe effect." Not only does quad sound get the direct sound coming from all the speakers, but it beams its sound from easily identifiable points, the first obvious, unacceptable flaw I noticed about it. The reason binaural sound played through headphones doesn't work is that when you turn your head the sound turns with it. To your brain that is the equivalent of two scalars, and it concludes that the sound is coming from inside your head. This has been known since at least the early 1960s, when I read it, and probably long before.
Chesky, along with very few others at that level of quality, is the best IMO. If one examined what they do, one wouldn't have to think much further. The DSD-versus-PCM format question, for example, is completely irrelevant in comparison, IMO.
The downside? They don't sound very direct and dynamic (just like live music heard from the usual distance).
Some of the tracks on the Chesky Demo Disk sound like the performers are right in front of me, living and breathing in my room. Some have excellent layering of instruments close and far, right and left on the soundstage. Some focus on transients and dynamics. For me it was just an easy way to get a disk with tracks emphasizing different sound qualities that I could become very familiar with and use for evaluating different gear and cables.
I love their airy sound. And it's a mystery why so few labels achieve this. Some techniques seem to have been kept quite secret, or are expensive. It would be so interesting to talk with Chesky about the influence of mic'ing, recording generally, mastering and equipment.
Haven't heard anyone mention the Decca Tree yet 😉 I experimented with Jecklin discs many years back and ended up personally preferring just spaced omnis. I own quite a mix of microphones and have had success with a number of commercial CDs. Every venue is so very different, and placing microphones, as most here know, comes with personal experience. (I oddly walk around placing mics with a finger in one ear, as mics are generally mono.)
Some venues have been quite challenging, like this example with a very large audience that required professional sound reinforcement, which complicates matters a lot:
“Nairn International Jazz Festival” https://ostwaldjazz.com/single/2715/nairn-international-jazz-festival
Sitting up high behind the audience, once the performance starts you are committed, of course; that's why sound checks are so very valuable. I watched one musician move one of my recording microphones, mistakenly thinking it was a PA microphone, but all was not lost.
I loved doing all of that, but nowadays I'm far too old to shift heavy recording gear. I loved it all while it lasted.
SM, I disagree. It doesn’t take a scientist to tell me if it works or not, or whether it is convincing. My own experience is sufficient. The illusion created by two channel stereo using high quality mics and good playback gear is good enough for me and 99% of listeners. Omni microphones capture and sum sound waves from all directions and convert them, as you say, into scalars. But isn’t that what the human ear hammer, anvil and stirrup do? Or do you believe the hammer, anvil and stirrup transmit discrete vectored signals without any mixing? Doesn’t the brain process the summed waveform of all the waves entering each ear, comparing the waveforms from the two ears and analyzing the wave pattern for minute differences and changes over time to extract frequency and spatial information? I could be wrong, but whether or not I understand how the ear and brain works doesn’t change how I perceive the sound. Regarding headphones, I own some very good ones and I have not had the sense that the sound is coming from inside my head. My headphone experiences have been very positive, in fact awesome. With them I feel I am realistically in the three-dimensional space, expansive in all directions. It does not sound artificial to me. The only thing I don’t like about headphones is the physical pressure of the headset on my head and the limitation on my movement.
“But isn’t that what the human ear hammer, anvil and stirrup do?”
Yes I agree that this is true.
“Doesn’t the brain process the summed waveform of all the waves entering each ear, comparing the waveforms from the two ears and analyzing the wave pattern for minute differences and changes over time to extract frequency and spatial information?”
That's not quite a complete explanation, and it's not so simple. When you turn your head even slightly, the time of arrival of the same sounds at your two ears changes. What has been used in the past (louder, sooner, HRTF) to explain directional detection is wrong, and the failure of binaural recordings played through headphones proves it. Instead, the vestibular system of the inner ear, right next to the cochlea (the organ of hearing), tells your brain the change in the position of your head, and your brain compares the change in time of arrival of the vectors to the change in the position of your head. This is how you really determine the sense of direction of the source and the perception of space: the change in time of arrival between your two ears, matched against the change in position of your head, remembered for a few microseconds. The head movements were noted by the late neurologist Oliver Sacks, who wrote the book "Musicophilia." This is similar to the way the Disney-invented multiplane camera creates the illusion of three dimensions from a two-dimensional image.
https://en.wikipedia.org/wiki/Multiplane_camera
It is the persistence of an image in your brain for a few microseconds, as the different layers move at different rates and the parallax of your two eyes changes, that informs your brain of perceived differences in distance even though the transparent cels are all the same distance away.
The required difference in time of arrival between your two ears is from two to five microseconds. It works in all three planes and it is how you know whether a sound is in front of you or behind you.
If you look back in Copper Magazine at the visit to Edgar Choueiri's lab at Princeton University, he recently discovered this fact and put it to use. The binaural recording is played through headphones as usual, but both signals are time delayed. A head-tracking camera senses the movement of your head and creates a corresponding increase in delay to one headphone and a decrease in delay in the other. This creates the illusion that the sound you are hearing is in front of you. If he reverses which ear gets less delay and which gets more, it would appear to be behind you. He can only do this, though, in one plane, the horizontal plane. The ability to sense the direction of sound in all three planes is a strategy for survival that has evolved in all two-eared animals over billions of years. When you hear a sound, your instinct is to turn your eyes in the direction it is coming from to see what it is. How does your brain know? It computes the angle you need to point your eyes at so that the sound arrives at both ears at precisely the same time.
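A simplified spaced-ears sketch of the interaural time differences that head-tracking trick manipulates (it ignores head shadowing and diffraction, and the ear spacing is an assumed value):

```python
# Simplified spaced-ears model (no head shadowing) of the interaural time
# difference a head-tracked binaural renderer adjusts as you rotate your head.
import math

C = 343.0            # speed of sound, m/s
EAR_SPACING = 0.175  # metres between the ears (assumed)

def itd_us(angle_deg: float) -> float:
    """Interaural time difference, in microseconds, for a source angle_deg
    off the median plane (simple path-length-difference model)."""
    return EAR_SPACING / C * math.sin(math.radians(angle_deg)) * 1e6

for deg in (0.5, 1, 5, 30, 90):
    print(f"{deg:5.1f} deg off-centre -> ITD ~ {itd_us(deg):6.1f} us")
# A head-tracked renderer shifts the left/right delays by this kind of amount
# as the head rotates, so the virtual source appears to stay put in the room.
```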
I could have told Choueiri this 47 years ago when I figured it out. Why did I need to know this? Because to convincingly reconstruct the reverberant field I had to beat the ear/brain ability to sense the direction it was coming from. As I posted above, this was the first fatal flaw I noticed immediately in quadraphonic sound. As soon as I heard it, I knew that the guys who invented it and hyped it didn't know shit. At that point in time, neither did I. I also didn't know that it wouldn't be long before I figured it out for myself.
If you are happy and satisfied with what you have all I can say is enjoy it. Unfortunately for me I wasn’t satisfied at all. I didn’t go through all of this trouble for the fun of it. Oh wait a minute, I’m getting a message from the ether. Oops, I did have a lot of fun with it and I still do. BTW, when correctly adjusted for a recording, the sound I hear from my prototype for me is stunningly beautiful to listen to. And on top of it all I have all the bass slam I could ask for as well.
I think even with a two-channel, two-speaker stereo system you hear variations in the sound when you move your head; at least I do in my system. When you think about it, what are your two ears but two microphones, each receiving waves from two speakers playing sounds recorded by two microphones? The two microphones that recorded the sound are analogous to the two holes of your ears. The summed sound information the mics pick up should be similar to the summed information your ears would detect. I think your multi-speaker system with delays might be adding an extra dimension that the original venue might not have had, or you are compensating for the flaws in the recording microphones or room setup. You are, I gather from your previous posts, also sometimes programming your system to alter the recorded acoustics to a different acoustic that suits your fancy. In my opinion a well-built audio system with an excellent two-channel recording should be able to convey the sound of the original room or hall without a lot of extra gear and steps. I do see where one or more pairs of speakers beside and/or behind the listening position can in some circumstances add realism by enveloping you with sound, especially if the side and rear walls of your listening room do not provide the right sound reflectivity for the acoustic you want.
I am not poo-pooing your system. I have a 12-channel, 35-driver audio system with adjustable channel delays that I use with my digital pipe organ, so I know in principle what you are talking about. Vectors are flying everywhere! Even in that system, when I move my head I hear the changes in sound from individual pipe voices emitted from a mere pair of speakers. I don’t use vector calculus, Fourier analysis or field theory. I just arrange the speakers and delays to sound the most realistic to my ears and enhance the recorded acoustics. But even though my big system amazes me with its multi-point sound wave emission, I am equally impressed with the spatial illusion my 2-channel system creates with fine recordings. The 2-channel system can get me almost to the level of the multi-channel system. The multi-channel system has more power and gets me closer to the pipes, whereas the 2-channel system presents the sound as though I am farther away from the pipes.
I too dislike the clamping pressure and the blocking of heat transfer caused by most headphones. But there have been some fine headphone designs out there for decades (Jecklin Float, AKG K1000 (AKG offered individual HRTF correction some 30 years ago), LB Acoustics Mysphere 3.1 and recently RAAL). As far as I understand the approach of Prof. Edgar Choueiri, Princeton, his BACCH2 SP system also features HRTF compensation!
As it applies to live acoustic music recordings in a music hall, Keith Johnson and Peter McGrath are the two most experienced recording engineers that I know of. Every application is as unique as the score, the musicians and the room. Capturing the performance requires extensive skill, a flexible toolbox and an ear for mastering.
These two articles provide insight to their various techniques.
https://www.theabsolutesound.com/articles/magic-at-mechanics
https://www.stereophile.com/content/capturing-it-live-peter-mcgrath
As an amateur bootlegger throughout the '70s-'90s (for our own personal consumption) we seldom had the luxury of optimal microphone placement in the room. The best recordings (indoors and out) were always up front, ahead of the soundboard, and occasionally in the front row of the first balcony if a hall's acoustics were magical.