Binaural 3D Spatialization Plugin

introduction

Spatial audio lets listeners localize sound sources in three dimensions, providing a more immersive listening experience. Surround sound is one way to implement it: a set of speakers is placed around the listener (e.g. 5 speakers and 1 subwoofer in a 5.1 setup), and each speaker is fed a different channel so that the reproduced sound is perceived as localized. However, surround sound requires significant hardware given the number of channels. In comparison, stereo audio contains just left and right channels and is far more common.

Spatial audio can be created in stereo for binaural playback from a source channel and a location in space relative to the listener (i.e. azimuth, elevation, and distance). Over headphones, this stereo audio contains the appropriate Interaural Level Differences (ILDs) and Interaural Time Differences (ITDs) for us to perceive the sound as positioned in space. Additionally, the anatomy of the outer ear filters the frequency spectrum of sound as it propagates into the inner ear, providing further cues used in localization. Taken together, we can model the path of sound from a point in space through the air to each inner ear as a pair of distinct Linear Time-Invariant (LTI) systems. Thus, with the correct FIR filter coefficients, we can simply convolve a digital audio signal with each filter to produce the left and right stereo channels.
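As a rough sketch of this idea (written in Python/NumPy purely for illustration, not the project's MATLAB code), rendering each ear's channel is a single convolution with that ear's HRIR. The toy HRIRs below are made up: a delay and attenuation on the right ear stand in for an ITD and ILD.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Convolve a mono signal with left/right HRIRs to get a stereo pair."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=0)

mono = np.random.randn(1024)
hrir_l = np.array([1.0, 0.0, 0.0, 0.0])  # sound reaches the left ear first
hrir_r = np.array([0.0, 0.0, 0.5, 0.0])  # later and attenuated at the right ear
stereo = binauralize(mono, hrir_l, hrir_r)
```

Real HRIRs are of course hundreds of taps long and encode the ear's full frequency response, but the operation is the same.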

Using these methods, this project synthesizes 3D spatial audio for headphone listening in the form of a software audio plugin developed in MATLAB. The basic functionality of the plugin is as follows:

  1. Take in azimuth, elevation, and distance parameters for a sound source relative to the listening position through a GUI
  2. Spatialize an audio signal in real time by applying the appropriate FIR filters for the left and right channels
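Real-time processing means the plugin receives audio in short blocks, so each FIR filter must carry its state from one block to the next. The overlap-add sketch below (Python for illustration; the plugin itself is MATLAB) shows one way to do this, and the class name and structure are hypothetical:

```python
import numpy as np

class BinauralProcessor:
    """Streaming FIR spatializer using overlap-add across audio blocks."""
    def __init__(self, hrir_left, hrir_right):
        self.hl, self.hr = hrir_left, hrir_right
        self.tail_l = np.zeros(len(hrir_left) - 1)   # convolution tail carried over
        self.tail_r = np.zeros(len(hrir_right) - 1)

    @staticmethod
    def _fir_block(block, h, tail):
        full = np.convolve(block, h)      # length: len(block) + len(h) - 1
        full[:len(tail)] += tail          # add the overlap from the previous block
        return full[:len(block)], full[len(block):]

    def process(self, block):
        left, self.tail_l = self._fir_block(block, self.hl, self.tail_l)
        right, self.tail_r = self._fir_block(block, self.hr, self.tail_r)
        return left, right

# Block-wise processing matches one long convolution over the whole signal.
hrir_l = np.array([1.0, 0.3, 0.1])
hrir_r = np.array([0.0, 0.6, 0.2])
x = np.random.randn(512)
proc = BinauralProcessor(hrir_l, hrir_r)
out = [proc.process(b) for b in np.split(x, 4)]
left = np.concatenate([o[0] for o in out])
right = np.concatenate([o[1] for o in out])
```

Because the tail of each block's convolution is added to the start of the next block, the output is identical no matter how the signal is chopped up.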

measuring HRIRs

Choosing the correct filter coefficients is crucial to effectively spatialize audio. These coefficients are equivalent to Head-Related Impulse Responses (HRIRs) recorded in experiments.

[Figure: HRIR measurement setup in an anechoic chamber. (source)]

By placing microphones in a human subject's ears and playing impulses from a speaker array, a set of HRIRs can be recorded for each ear across many positions in the space around the subject. In practice, excitation signals with better signal-to-noise ratio than a raw impulse, such as exponential sine sweeps, are usually played instead, and the impulse response is recovered afterward. An anechoic chamber may be used to eliminate acoustic reflections and isolate the impulse responses. Since the anatomy and placement of every human ear is different, HRIRs will differ across subjects.
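One common way to recover the impulse response from such a measurement is regularized spectral division (deconvolution). This Python sketch is only illustrative: the excitation is random noise standing in for a sweep, and the 4-tap "HRIR" is made up.

```python
import numpy as np

def deconvolve(recorded, excitation, eps=1e-8):
    """Estimate h where recorded = excitation * h (linear convolution)."""
    n = len(recorded)                  # pad FFTs to the recording length
    S = np.fft.rfft(excitation, n)
    R = np.fft.rfft(recorded, n)
    # Regularized division avoids blowing up at bins where |S| is tiny.
    return np.fft.irfft(R * np.conj(S) / (np.abs(S) ** 2 + eps), n)

# Simulate a measurement with a known short impulse response.
rng = np.random.default_rng(0)
excitation = rng.standard_normal(4096)
true_h = np.array([0.0, 1.0, 0.4, -0.2])
recorded = np.convolve(excitation, true_h)   # what the ear microphone captures
est_h = deconvolve(recorded, excitation)
```

The estimate `est_h` recovers `true_h` in its first taps and is near zero elsewhere; with a real sweep, noise, and room reflections the windowing and regularization need more care.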

This large set of impulse responses, along with the manner in which they were recorded, is then stored digitally in a Spatially Oriented Format for Acoustics (SOFA) file. Datasets of these HRIRs have been published online, and we will use one of them for this project.
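A SOFA file stores HRIRs on a grid of discrete measurement positions, so a spatializer must pick the pair closest to the requested direction. The nearest-neighbor lookup below is a Python sketch with synthetic data; the array layout (measurements × ears × samples, positions as azimuth/elevation in degrees) mirrors the SOFA conventions, but the loading step itself is omitted.

```python
import numpy as np

def nearest_hrir(source_positions, hrirs, azimuth, elevation):
    """Pick the measured HRIR pair closest to the requested direction.

    source_positions: (M, 2) array of (azimuth, elevation) in degrees.
    hrirs: (M, 2, N) array of impulse responses (left and right ears).
    """
    az = np.radians(source_positions[:, 0])
    el = np.radians(source_positions[:, 1])
    # Compare directions as unit vectors so that 359 deg and 1 deg are neighbors.
    grid = np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)], axis=1)
    ta, te = np.radians(azimuth), np.radians(elevation)
    target = np.array([np.cos(te) * np.cos(ta), np.cos(te) * np.sin(ta), np.sin(te)])
    idx = np.argmax(grid @ target)     # largest dot product = smallest angle
    return hrirs[idx]

# Toy grid: four azimuths on the horizontal plane, dummy 8-tap HRIRs.
positions = np.array([[0.0, 0.0], [90.0, 0.0], [180.0, 0.0], [270.0, 0.0]])
hrirs = np.arange(4 * 2 * 8, dtype=float).reshape(4, 2, 8)
pair = nearest_hrir(positions, hrirs, azimuth=350.0, elevation=5.0)
```

Denser HRIR grids allow interpolation between neighboring measurements instead of snapping to the nearest one, which reduces audible jumps as the source moves.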

the MATLAB prototype

The code can be found here. More information to come soon!

references

"Allen Lee (Independent) - Building a Spatial Audio Plugin"

"Read, Analyze and Process SOFA Files"

Thanks to Dr. Aaron Lanterman for mentoring and supporting this project.