(Copper Issue 165 featured Part One of our coverage of the Audio Engineering Society's AES Europe Spring 2022 convention from The Netherlands. Part One covered presentations on the use of analog vs. digital equipment during audio mixing; the trend towards mobile audio wearables; and the challenges faced by engineers in the archiving and restoration of audio from analog disc recordings. Part Two continues with some academic research studies conducted by AES members, and a look at car stereo system designs.)
Perceptual Optimization of Hybrid Stereo Width Control Method Compared with Loudspeakers Reproduction
One of the biggest trends in the audiophile and engineering arenas is the explosion of interest in immersive sound. The early quadraphonic and surround sound efforts from the 1970s and ’80s now seem crude, thanks to digital technology, and immersive audio is becoming more and more accessible in both the pro and consumer markets.
However, while systems that can reproduce audio content mixed for multichannel immersive sound listening are now within the reach of many, the vast majority of listeners, as well as a large percentage of audiophiles, still do the bulk of their listening in stereo, on speakers or via headphones. This poses a number of challenges: how does immersive content successfully collapse into a hybrid stereo program with a minimum of detail loss due to inverted phase relationships, inter-channel crosstalk and midrange EQ overcrowding? What does a successful stereo mixdown entail?
Presented by audio research scholar Mitsunori Mizumachi (of the Advanced Telecommunications Research Institute International in Kyushu, Japan), who conducted the study with colleagues Yui Ueno and Toshiharu Horiuchi, Perceptual Optimization of Hybrid Stereo Width Control Method Compared with Loudspeakers Reproduction attempted to address these hurdles and explain some of the acoustical science behind them.
Head-Related Impulse Response (HRIR) relates to how a sound is captured by the ears. As a sound is heard by the listener, their ears and head transform the sound, which provides localization cues. The process can be mathematically modeled, and HRIR is one method that can be used to create simulated binaural renderings. However, the process, which often involves head and torso simulators for recording, suffers from the deterioration of localization performance.
The researchers found that a hybrid stereo width control, which adds a monophonic signal to the original stereo sources to shrink or widen the perceived stereo width, could give spatial attributes to individual sound sources, while HRIR could handle the overall spatial impression of the soundstage. Using the hybrid stereo width control to adjust amplitude and phase-adjusted crosstalk elements required meticulous tweaking, especially for different music genres. The researchers’ findings were predicated on perceptually optimized parameters, as opposed to amplitude measurements or other objective data.
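To make the idea of blending a monophonic signal into a stereo pair more concrete, here is a minimal sketch of a generic mid/side width control. This is the well-known textbook technique, not the researchers' hybrid method, and the function name `adjust_width`, the `width` parameter and the sample values are all assumptions made for illustration.

```python
def adjust_width(left, right, width):
    """Rescale the side (L minus R) part of a stereo signal.

    width = 0.0 folds the signal to mono, 1.0 leaves it unchanged,
    and values above 1.0 exaggerate the left/right difference.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # monophonic (shared) component
        side = 0.5 * (l - r)   # difference component that carries width
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right


# Hypothetical two-sample signal: hard left, then hard right.
left, right = [1.0, 0.0], [0.0, 1.0]
print(adjust_width(left, right, 0.5))  # ([0.75, 0.25], [0.25, 0.75])
```

Halving the side component pulls each hard-panned sample toward the center, which is the "shrink" direction described above; a `width` above 1.0 would push the channels further apart instead.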
There were 11 participants in the perceptual optimization listening tests whose ages ranged from 21 to 24. All of them were pre-tested as having normal hearing. The stereo width control was calibrated in 11 steps. The control settings ranged from 0 to 90 degrees in stepped increments.
The test subjects were asked to compare the width control methods and select the sound source with the widest perceived stereo range. The classical recordings chosen for the listening tests were:
- Violin Concerto in D major, Op. 35 (Tchaikovsky) by Anne-Sophie Mutter and Wiener Philharmoniker (Deutsche Grammophon)
- Serenade in G major, K.525, “Eine Kleine Nachtmusik” (Mozart) by Takács Quartet (Decca)
An RME ADI-2 DAC FS digital-to-analog converter and Sony MDR-M1ST headphones were also used in the tests.
Overall, the hybrid method appeared to yield a higher above-the-mean response for the orchestral music, but pulled closer to the mean with less variance with the string quartet piece.
During the Q and A segment of the presentation, Mizumachi noted that in listening tests with pop or jazz music, the localized placement of specific instruments within the soundscape was more crucial to the perception of spaciousness, due to the smaller ensembles and the specific instrumental roles within those genres. With classical music, the reverberant room was a more significant factor, even with the Mozart string quartet piece. Additionally, he acknowledged that the subjectivity of perception could be affected by the age of the listener, since younger people spend more time wearing headphones and engaging in video games and other content where immersive soundscapes are more common.
Survey of User Perspectives on Headphone Technology
Presented by Milap Rane from the University of Surrey (UK) Institute of Sound Recording (IoSR), Survey of User Perspectives on Headphone Technology was co-authored by Phil Coleman, Russell Mason, and Soren Bech.
One reason the authors were motivated to conduct their survey was that there is a dearth of research literature on the topic. The few published studies are limited in their context and specificity, which is perhaps surprising given the current ubiquity of headphones, in-ear monitors (IEM), earbuds, and related products, and the wider range of applications that have emerged of late, especially in VR and immersive content.
The goals of the 38-question survey conducted by Rane and associates were:
- Obtain a current assessment of the average headphone user’s experience and catalog any problems they presently encounter.
- Create potential ideas, directions and solutions for future headphone technologies, to address current problems using both existing and new technological advances.
The survey asked questions about music, spoken word (podcasts, audiobooks), telecommunications, TV, video, radio and web content. The context of the survey included home, outdoors, and public transit environments.
The authors received the following results from the 406 respondents:
- The 18 – 30 age demographic comprised over half of the respondents.
- 3 percent used headphones 3 hours a day, with nearly 25 percent listening up to 6 hours a day, and 20.8 percent over 6 hours a day.
- Over-ear headphone users comprised 49.2 percent, IEM users 27.34 percent, earbud listeners 14.06 percent and on-ear users, 8.59 percent.
- Music was the most listened-to content at 39 percent, followed by telecom (phone calls) at 23 percent, film/TV, radio, and web listening at 18 percent, and spoken word at 14 percent. Earlier surveys had spoken word in second place.
- Contextually, home use was the overwhelming preference, ranging from 75 to 98 percent in each content application.
- Respondents were asked if they encountered issues in the following areas: degree of noise cancellation, transparency (i.e., the ability to hear desired outside environmental sounds), comfort, sound quality (intelligibility, and the ability to customize their EQ preferences), hearing fatigue, microphone sensitivity, and concerns about headphone leakage and privacy.
- New headphone features that respondents would like to have in the future included: 3-D audio, wireless connectivity across multiple devices, and more flexibility in controlling volume and the perceived stereo sound field.
- Unsurprisingly, customization preferences were wanted more for home listening, while noise cancellation was the most-desired feature for outdoor and public transit environments.
The most-desired new feature cited (at 21 percent) was improved noise cancellation, with a call for “smart” noise cancellation that would allow public transit announcements to be heard more clearly. Nine percent requested improved form factors in headphone design, and 8 percent wanted better sound quality. Only 6 percent requested spatial audio capability.
Automotive Audio: Tuning a Car
Cars are a major environment for music listening, especially during work commutes. Newer cars with onboard computers that offer GPS, audible safety alerts, and other functions make car sound system quality even more important. Automotive Audio: Tuning a Car was presented by Hope Sheffield, an acoustic engineer for Harman International and the designer of the multi-speaker sound system in the Lincoln Navigator SUV.
Sheffield, who began working with Harman in 2018, explained that the automakers dictate the performance parameters, number of speakers, space availability, budget, noise cancellation parameters, communication requirements and other parameters, and the “vehicle tuners” then design the car’s audio system to best meet those specifications, both in terms of hardware and DSP (digital signal processing) software.
Sheffield noted that the main difference between tuning a room and a car is that a room can be acoustically treated to maximize the performance of selected equipment, whereas acoustic treatment of a vehicle is not an option. This forces a vehicle tuner to tweak the equipment to best suit the automobile’s particular internal environment.
As the shape of a vehicle inherently creates interior surfaces that are parallel or close to parallel, the vehicle tuners’ primary obstacles are in dealing with reflections and phase cancellation issues within the small enclosed space.
Since vehicle cabin sizes are somewhat standardized, the tuning techniques used for each type of cabin (compact car, SUV and so on) have a particular baseline. However, interior seat material variances, different types of window glass, and other factors that can alter sound wave reflections mean that the vehicle tuners have to construct customized workarounds.
Less-than-optimum speaker placements, differing numbers of passengers, and other variables also introduce compromising elements to a vehicle tuner’s design plan. While the driver’s listening position is something of a priority, great pains are taken to tweak the sound system design to maximize the listening experience for both the front and rear passengers as well.
From an acoustician’s perspective, the design goal is to get constructive sound wave interference, where having sound waves in phase will generate a positive “bump” in clarity and fidelity, as opposed to destructive interference, where out-of-phase waves will cause different frequencies to be anywhere from attenuated to inaudible.
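As a rough illustration of constructive versus destructive interference, the sketch below sums two equal-strength copies of a tone that reach the listener over paths of slightly different length, for example a direct sound and a single reflection. It is a simplified free-field model, not anything shown in the presentation; the function name `combined_level_db`, the 0.1715-meter path difference and the assumed speed of sound are illustrative choices only.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 °C


def combined_level_db(freq_hz, path_diff_m):
    """Level of two equal-strength arrivals relative to one arrival alone."""
    # Extra travel distance becomes a phase lag at the given frequency.
    phase = 2.0 * math.pi * freq_hz * path_diff_m / SPEED_OF_SOUND_M_S
    magnitude = abs(1.0 + cmath.exp(-1j * phase))
    if magnitude < 1e-9:
        return float("-inf")  # complete cancellation
    return 20.0 * math.log10(magnitude)


# Hypothetical path difference of 0.1715 m (half a wavelength at 1 kHz).
for freq in (250, 500, 1000, 2000):
    print(freq, "Hz:", round(combined_level_db(freq, 0.1715), 1), "dB")
```

With this path difference the two arrivals cancel at 1 kHz, where they are half a cycle apart, and reinforce by about 6 dB at 2 kHz, where they are a full cycle apart. The same geometry therefore helps some frequencies and hurts others, which is why reflections in a small cabin are so troublesome.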
Sheffield noted that a properly tuned system with only four speakers can sound better than a 20-plus speaker system that suffers from unwanted reflections and phase cancellations, even though the latter might be more successfully marketed at a higher price point.
Sheffield cited the flexibility of 3-way speakers in being able to interface with a car’s DSP systems to optimize low-, midrange and high-frequency response to personal tastes without sacrificing sound quality. Also, multiple speaker arrays enable expanded DSP capabilities that allow for immersive sound aspects, such as height perception through a vehicle’s roof and D-pillar speakers. The use of separate tweeters for high-frequency customization, and mid-woofers, woofers, and subwoofers to better subdivide and adjust the lower frequencies, also would be unworkable without DSP technology.
The vehicle tuning process begins with taking graphic measurements from particular spots within a proposed vehicle, to compare the acoustic sweep responses with different speakers in the targeted cabin enclosure. Delay times are also measured, as manipulation of delay times is one of the most critical elements in the tuning process. The engineers will also use pink noise across an entire system to mimic the human perception of music and other sound sources.
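One common use of delay in a multi-speaker system is time alignment: holding back the nearer speakers so that their sound reaches a listening position together with the sound from the farthest one. The sketch below shows only that arithmetic. It is not Harman's procedure, and the function name `alignment_delays_ms`, the speaker names and the distances are hypothetical values chosen for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 °C


def alignment_delays_ms(distances_m):
    """Delay each speaker so its sound arrives with the farthest speaker's."""
    farthest = max(distances_m.values())
    return {
        name: (farthest - dist) / SPEED_OF_SOUND_M_S * 1000.0
        for name, dist in distances_m.items()
    }


# Hypothetical speaker-to-driver distances in meters, not measured data.
distances = {
    "front_left": 0.9,
    "front_right": 1.4,
    "rear_left": 1.1,
    "subwoofer": 2.2,
}
for name, delay in alignment_delays_ms(distances).items():
    print(f"{name}: {delay:.2f} ms")
```

The farthest speaker gets no delay and the nearest gets the most. Because the distances differ for every seat, delays chosen for the driver will be wrong for the other passengers, which is one reason the measured data has to be weighed against listening tests.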
The data is constantly compared with the vehicle tuners’ subjective listening assessments, to ensure that the tuners aren’t ultimately being fooled into making erroneous decisions.
Sheffield and her colleague have created a new Acoustic Prediction Tool (APT) for Harman that aids them in calculating optimum frequency levels to pinpoint the designs of their speaker crossovers, for the best overall audio response and to keep speaker coloration to a minimum.
These final calculated parameters are then applied to each vehicle model variant, and then final listening tests are conducted in an actual test prototype vehicle. The final listening tests may result in tweaking the audio system’s EQ filters, delay, channel gains, limiters and phase inversion (speaker polarity), and evaluating the system’s speaker alignment and dynamic performance. The listening-test criteria emphasize naturalness (bass, midrange and treble frequency evenness) and image focus (left, center, right, and additional channels).
Sheffield indicated that the system’s imaging focus will often be skewed to optimize listening from the driver’s seat, which would be on the left for North America and parts of Asia and Europe (conversely, on the right for the UK, Japan, Australia, and some other nations). She noted that side speakers often will be designed to carry the midrange, which often requires delay and other manipulation for those frequencies to mesh seamlessly with the front left, center and right channels. Mounting speakers on the dashboard creates other issues because of windshield reflections, and makes achieving a balance between the front and rear more difficult. She also noted that it can be particularly challenging to tune a system to best reproduce the sound of a low-frequency instrument like an upright bass, which will often require delay or filtering to keep the bass from sounding thin, unnatural, out of phase, or significantly different between the front and rear seats.
Sheffield starts her speaker alignment process by balancing the left and right speakers, then bringing in the center one(s). By tweaking the speaker polarity and delay parameters, she strives to achieve the proper level of naturalness for vocals so that the center channel(s) delivers the right level of imaging in the overall soundscape. Next, also using polarity and delay adjustments, she calibrates the rear and side speakers and subwoofer, with care taken not to cause the midrange to become indistinct.
Additional features that vehicle tuners also have to consider are the following:
- Anti-noise technologies are used by some companies including General Motors. The process is similar to that used in noise-cancelling headphones, where microphones pick up the sound of unwanted noise, which is then combined with an out-of-phase signal to cancel it out. In the case of a vehicle, this noise cancelling maintains a quieter interior environment.
- When tuning a system designed to have surround sound capability, the system still has to sound just as good for source material not designed for surround, or for when the system is playing in stereo mode.
- Harman’s Virtual Venues surround-processing is designed to recreate the sound of different spatial characteristics from famous concert halls, theaters and arenas. It uses reverbs and algorithms within the DSP software, and the corresponding hardware needs to be compatible with the Virtual Venues technology.
- In-Car Communication (ICC) systems need to not only cleanly reproduce the sounds of the computer’s voice for safety alerts or GPS directions, but may also utilize microphones built into the car’s interior that will pick up conversation from the front and rear passengers and augment them through the car’s speakers for enhanced articulation and clarity.
Sheffield displayed a commanding knowledge of automotive audio engineering, and her insights into the particulars of vehicular sound system design were especially fascinating in light of how far in-car audio technology has advanced.
Our AES Europe Spring 2022 coverage will conclude in Part Three, which will cover topics including vinyl record mastering; listening room customization; and acoustical design for houses of worship.