How does Auralization Protect Against Incorrect Expectations on Sound Quality?

March 27, 2017:


Music touches us. Music calms our minds and makes us happy or frightened. Yet there is no universal “goose bumps effect” that works equally well in every listening environment. To create a natural listening experience, simulation and auralization are used in room acoustics. Mvoid has developed an acoustic VR-like development environment for listening to audio systems that are based purely on computer-generated models, thus opening the possibility of evaluating sound systems by means of subjective listening tests on virtual products.


Music in your ears

You are surely familiar with the situation: you listen to music and feel emotions like joy, pleasure or excitement. Music touches us. Music calms our minds and makes us happy or frightened. Perhaps you also know music that triggers a special mood in you because it is linked to an emotionally significant event in your life.

Music researchers and psychologists claim that the feelings evoked by music, the social contact it creates and playing a musical instrument all have consequences for our health: through positive emotions, music helps us to regenerate and strengthens our health.

Have you ever wondered what music activates in us? How does it evoke emotions? Science can trace the path of sound through the ear and inner ear closely, right down to the auditory nerve. Electrical signals are transmitted to the brain and interpreted as sounds. Brain researchers and psychologists regard music as a phenomenon that involves the whole brain. As soon as music is heard, the brain searches for the emotional meaning in the acoustic signal.

The Swedish musicologist Patrik Juslin from the University of Uppsala claims that music, especially its rhythmic part, flows directly into the brain stem, the oldest part of our brain (also called the reptilian brain). The brain stem reacts unconsciously, automatically and quickly to sounds. There, circuits hard-wired by evolution are at work. If, for example, a shot is fired, the body switches to red alert. Fast, loud, screeching sounds raise the heart rate, while slow rhythms and low tones have a calming effect. Acoustic effects give movies silence and tempo, tension and entertainment. The sense of hearing warns us of unpleasant situations. Distorted-sounding music triggers fear and grief in us.

There is, however, no universal “goose bumps effect” that works equally well everywhere. The listening environment plays an important role: are we watching one of the latest films in a movie theater, listening to a brilliant soloist or orchestra in a concert hall, sitting in the living room, or driving in our vehicle and enjoying our favorite music? A vehicle cabin is obviously not acoustically comparable to a famous concert hall.

To create a natural, emotional listening experience, simulation and auralization are nowadays used in room acoustics: the acoustics of a sound system and a room are made audible on the basis of their geometrical and acoustical properties. Auralization enables people to hear how a sound system sounds in a particular room. This article covers the auralization of a vehicle cabin.


How does auralization protect against incorrect expectations on sound quality?

What is auralization?

Auralization is the process of rendering a soundfield audible. This generally involves convolving an anechoic audio recording with an acoustic impulse response (Wikipedia).
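The convolution mentioned in this definition can be sketched in a few lines. This is only a minimal illustration using NumPy, not Mvoid's implementation; the function name `auralize` and the toy signals are assumptions made up for the example.

```python
import numpy as np

def auralize(dry_signal, impulse_response):
    """Convolve an anechoic (dry) signal with an acoustic impulse response."""
    return np.convolve(dry_signal, impulse_response)

# Toy data, invented for illustration: a single click as the "recording",
# and a "room" consisting of the direct sound plus one weaker echo.
dry = np.array([1.0, 0.0, 0.0, 0.0])
ir = np.array([1.0, 0.0, 0.5])   # direct sound, echo two samples later
wet = auralize(dry, ir)          # length is len(dry) + len(ir) - 1
```

The output contains the click followed two samples later by its echo at half amplitude, which is exactly what the impulse response describes.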


Why is auralization important?

Auralization allows acousticians to draw conclusions about the sound of an audio system and room as early as the planning phase and to evaluate them by means of a subjective listening experience. People can hear how the audio system sounds in the vehicle cabin even before the first prototype is built.

Auralization draws on the findings of psychoacoustics and involves the assessment of sound and the measurement of subjective perception. There is no universally valid definition of sound quality, least of all a numerical metric that would make different systems easy to compare and assess. The assessment of sound quality therefore relies on subjective methods using listening tests. The human ear works far more subtly than such metrics and perceives the finest differences, so the acoustic quality must be reproduced as a listening experience.

Any sound, noise, music, or in general any signal generated, transmitted, radiated and perceived can more precisely be interpreted and compared by people if it is made audible instead of discussing “levels in frequency bands”, “single number quantities” or “dB(A)”, as Univ.-Prof. Dr. rer. nat. M. Vorländer has stated.


How does the auralization of a vehicle cabin work? Which elements are important?

Through the auralization of a vehicle cabin, the acoustic quality of a vehicle undergoing development is evaluated as a listening experience.

The goal of the acoustics experts is to provide a development environment in which auralization can be used to influence how sound is created in the vehicle cabin. This requires a so-called acoustic fingerprint of the vehicle cabin. By applying multiphysical simulation, a virtual product development environment is created. Methods and tools of numerical acoustics are used to calculate the best possible loudspeaker positions, the resulting sound fields and the radiation of vibrating structures in the vehicle cabin.

This simulation provides first results of the acoustics in a planned vehicle cabin. The results are based exclusively on numerical calculations.

Image: Multiphysical simulation of a vehicle cabin. The sound field of any vehicle cabin can be analyzed by multiphysical simulation for different frequencies.


However, the calculated frequency spectrum at a given location in the vehicle cabin is an abstract quantity. From the calculated spectrum alone, the multiphysical simulation does not allow comprehensive conclusions about the sound quality at a particular seat. The spatial attributes of sound reproduction are not yet sufficiently captured by the multiphysical simulation.

This is where auralization comes in. It is about hearing the specific spatial properties. Auralization protects development engineers and acoustics experts from inaccurate expectations of sound quality. During auralization, the quality of the sound system components and of the room acoustics is put to the test. It makes it possible to listen to digital prototypes of audio systems and to assess their product quality. The focus is on prediction.

Auralization also affects the required simulation models: the multiphysical simulation model needs to be coupled with the auralization.

The starting point for the virtual listening environment is the simulation of the binaural (binaural = with both ears) room impulse response (BRIR) of a loudspeaker radiating into a vehicle cabin. The BRIR is generated from a multiphysical, vibro-electroacoustical CAE-based simulation model of the loudspeaker (transducer and enclosure), the vehicle cabin as well as a model for binaural hearing of humans.


1. Identification of the binaural room impulse response (BRIR)

a) Room Impulse Response (RIR)

Room impulse responses are used to provide basic information about the time behavior of a loudspeaker or sound system.

In a first step, the room impulse response, which includes the loudspeaker, the enclosure and the listening space, is simulated at the positions in the vehicle where people are seated and their ears are located. In this way, the various seat positions are covered by the simulation model.

Humans usually perceive the frequency range from 20 Hz to 20 kHz. Because no single numerical scheme can cover the whole audible frequency range, a hybrid model is used. The hybrid model is based on finite elements as well as ray tracing. The finite elements cover the bass and mid regions, while ray tracing covers the high-frequency range in the simulation model. This allows the RIR to be analyzed for all channels of the sound system.
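One way to picture how results from two solvers could be joined is a frequency-domain crossover. The sketch below is an assumption for illustration only: the article does not say how Mvoid merges the finite element and ray tracing results, and the function name `merge_hybrid`, the crossover frequency and the transition width are hypothetical.

```python
import numpy as np

def merge_hybrid(rir_low, rir_high, fs, f_cross, width):
    """Blend two equal-length impulse responses in the frequency domain.

    Below the crossover the low-frequency result dominates, above it the
    high-frequency result. The two weights sum to one at every frequency.
    """
    n = len(rir_low)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Linear ramp from 0 to 1 across [f_cross - width/2, f_cross + width/2],
    # then shaped into a raised cosine for a smooth transition.
    ramp = np.clip((freqs - (f_cross - width / 2)) / width, 0.0, 1.0)
    w_high = 0.5 - 0.5 * np.cos(np.pi * ramp)
    w_low = 1.0 - w_high
    spectrum = w_low * np.fft.rfft(rir_low) + w_high * np.fft.rfft(rir_high)
    return np.fft.irfft(spectrum, n=n)
```

Because the weights are complementary, feeding the same response into both inputs returns it unchanged, which is a quick sanity check for any such blend.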


b) Binaural Room Impulse Response

In a second step, the binaural model, based on an analytical head-related transfer function (HRTF), is added to generate the BRIR.

In order to explain the binaural room impulse response, we would like to examine human hearing at this point:
People are able to locate sound. We have learned to use our head and our two ears to evaluate the signals that reach our eardrums. If a sound comes from the left, for example, its path to the left ear is shorter than its path to the right ear. The two ear signals are also diffracted and reflected in different ways due to our anatomy (head, neck, shoulders and ears).

This means for the determination of the binaural room impulse response:

  • For a realistic virtual listening experience, all major psychoacoustical effects must be considered.
  • Room impulse responses cannot be used directly at the ears.
  • Room effects and the localization of sound are based on binaural hearing. These effects are not included in the room impulse responses.
  • At our two ears, sound events arrive with differences in time and level, caused by the relative position to the sound source as well as reflection and diffraction due to our head and torso. These directional differences, which are called Interaural Time Difference (ITD) and Interaural Level Difference (ILD), cause significant changes to incident sound waves.
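The time and level differences named in the last point can be made concrete with a small sketch. It uses the Woodworth formula, a textbook approximation for a rigid spherical head and a distant source, which is not Mvoid's analytical model. The head radius of 0.0875 m and the speed of sound of 343 m/s are assumed typical values, and both function names are hypothetical.

```python
import numpy as np

def woodworth_itd(azimuth_rad, head_radius=0.0875, speed_of_sound=343.0):
    """Interaural time difference in seconds (Woodworth approximation).

    Valid for source azimuths between 0 (straight ahead) and pi/2 (to the side).
    """
    return head_radius / speed_of_sound * (azimuth_rad + np.sin(azimuth_rad))

def ild_db(left, right):
    """Interaural level difference in dB: energy of left relative to right."""
    return 10.0 * np.log10(np.sum(left ** 2) / np.sum(right ** 2))
```

With these assumed values, a source directly to one side gives an ITD of roughly 0.66 ms, while a source straight ahead gives zero.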


These binaural effects, caused by the diffraction of the incident sound waves on the head and torso, influence the sound pressure at the ear. They can be described by an HRTF. HRTFs describe the relationship of sound pressure at the eardrum or ear canal entrance to an incident plane wave without reflection and diffraction effects, i.e., without the head or torso.

Mvoid uses an analytical model for the description of HRTFs. The model allows the derivation of time and level differences of incident sound waves and thus the determination of the HRTFs of the left and right ear. It must be noted that the model is a complete 3D model and thus HRTFs vary with azimuth (horizontally) as well as elevation (vertically) for arriving sound waves.

By determining the head-related transfer functions, it is possible to calculate the binaural room impulse responses for the left and right ear by applying the head-related transfer functions to the simulated room impulse responses coming from the multiphysical simulation model.
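Applying a transfer function to an impulse response amounts to multiplying their spectra. The following sketch shows that step for one ear under a strong simplification: a single head-related impulse response (HRIR) is applied to the whole RIR, whereas the model described in this article varies with azimuth and elevation. The function name `apply_hrtf` is hypothetical.

```python
import numpy as np

def apply_hrtf(rir, hrir):
    """Filter a room impulse response with an HRIR via spectral multiplication."""
    n = len(rir) + len(hrir) - 1            # length of the linear convolution
    # Zero-padding both inputs to n samples avoids circular wrap-around.
    spectrum = np.fft.rfft(rir, n) * np.fft.rfft(hrir, n)
    return np.fft.irfft(spectrum, n)

def make_brir(rir_left, rir_right, hrir_left, hrir_right):
    """Return the (left, right) binaural room impulse responses."""
    return apply_hrtf(rir_left, hrir_left), apply_hrtf(rir_right, hrir_right)
```

The result is numerically the same as a time-domain convolution of the RIR with the HRIR; the frequency-domain form simply mirrors the transfer function view used in the text.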

Whereas typical binaural audio applications use only one microphone located at the center of the head to derive a single room impulse response, Mvoid uses two room impulse responses from the simulation model, corresponding to the ear canal entrance locations of the left and right ear. Thus, the ITD is based directly on the simulation model. This results in improved accuracy and a more natural listening experience.

Another advantage of the analytical HRTF model is that the sound path through the ear canal is not included. During playback through headphones, this part of the sound path is added by the listener’s own ear physiology. The HRTF model calculates the sound at the ear canal entrance, which is very close to the transducer used in a headphone, and thus gives realistic results.


2. Auralization and real time processing – coupling with the multiphysical simulation model

Real-time processing is crucial for meeting the requirements of acoustic virtual reality. To enable this, the HRTF filters are implemented in the discrete time domain.
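A discrete-time filter can run in real time when audio arrives in short blocks, provided the filter remembers what spills over from one block into the next. The sketch below shows this overlap-add idea with a plain FIR filter. It is an assumption about one possible structure, not a description of the Mvoid VRtool, and the class name `BlockFIR` is hypothetical.

```python
import numpy as np

class BlockFIR:
    """Streaming FIR filter: feed fixed-size blocks, get filtered blocks back."""

    def __init__(self, taps, block_size):
        self.taps = np.asarray(taps, dtype=float)
        self.block_size = block_size
        # Samples that spill past the end of a block are kept for the next call.
        self.tail = np.zeros(len(self.taps) - 1)

    def process(self, block):
        full = np.convolve(block, self.taps)   # block_size + len(taps) - 1 samples
        full[: len(self.tail)] += self.tail    # add carry-over from earlier blocks
        self.tail = full[self.block_size:]     # store the new carry-over
        return full[: self.block_size]
```

Processing a signal block by block with this class yields the same samples as filtering the whole signal at once, but each output block is available as soon as its input block has arrived.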

The BRIRs are finally convolved (in the sense of the mathematical operation) with acoustical test files (sound files containing music, speech or noise), and ultimately a binaural listening experience is created by sending the final signal to headphones.

The following image shows the signal flow of the auralization environment using Mvoid® VRtool:

Image: Signal flow of the auralization environment using Mvoid® VRtool

First, test signals are sent to a routing matrix, which distributes the two-channel (stereo) signal to the individual channels of the sound system. A premium sound system nowadays has 24 channels or more. Each channel passes through sound tuning by means of a module with digital signal processing capabilities before the resulting signal is sent through the simulation data. This tuning process must be integrated into the auralization environment so that the sound quality of each channel can be improved in real time. The simulation data (the RIRs) are finally rendered by means of HRTFs and sent to the headphones for a realistic listening experience.

All components of the Mvoid auralization are developed for real-time processing. Any modification of a parameter, such as a tuning parameter, directly produces a visual response on the monitor (a graph showing the frequency or impulse response) and an audible response via the headphones.
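The signal flow described above can be sketched offline as a chain of routing, tuning and binaural rendering. This is a simplified illustration, not the Mvoid VRtool: the tuning stage is reduced to one gain per channel, the BRIRs are assumed to be precomputed, and the function name and array layouts are hypothetical.

```python
import numpy as np

def auralize_system(stereo, routing, gains, brirs):
    """Route a stereo signal to loudspeaker channels and render it binaurally.

    stereo  : (n_samples, 2) test signal
    routing : (n_channels, 2) routing matrix from stereo to system channels
    gains   : (n_channels,) per-channel gain, a stand-in for the DSP tuning
    brirs   : (n_channels, 2, ir_len) BRIR per channel and ear
    returns : (n_samples + ir_len - 1, 2) headphone signal
    """
    channel_signals = stereo @ routing.T         # (n_samples, n_channels)
    channel_signals = channel_signals * gains    # simplified tuning stage
    n_out = stereo.shape[0] + brirs.shape[2] - 1
    out = np.zeros((n_out, 2))
    for ch in range(routing.shape[0]):
        for ear in range(2):
            # Each channel contributes to both ears through its own BRIR.
            out[:, ear] += np.convolve(channel_signals[:, ch], brirs[ch, ear])
    return out
```

Each ear signal is the sum over all loudspeaker channels, so changing one channel's gain changes the headphone signal without touching the simulation data.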


Conclusion: Mvoid’s virtual auralization generates reliable audible predictions

Mvoid has developed an acoustic VR-like development environment for listening to audio systems that are based purely on computer-generated models, thus opening the possibility of evaluating sound systems by means of subjective listening tests on virtual products. The accuracy of the method was validated by listening tests based on A/B comparisons between auralization and a real vehicle.

The performance of sound systems can thus be audibly improved in the concept phase, which leads to a shorter development time, while at the same time reducing costs and increasing engineering efficiency.


Image references:
Ear with sound, Fotolia: psdesign1 #46803420
142 Internat. AES + AES 2017 Automotive Audio: AES Audio Engineering Society Inc.
NAFEMS World Congress: NAFEMS NPO Company
Photo AS, Multiphysical Simulation Vehicle Cabin, Auralization Process: Mvoid Technologies GmbH