The USC Andrew and Erna Viterbi School of Engineering
USC Signal and Image Processing Institute
USC Ming Hsieh Department of Electrical Engineering
University of Southern California

Technical Report USC-SIPI-359

“Time-Frequency and Adaptive Signal Processing Methods for Immersive Audio Virtual Acquisition and Rendering”

by Athanasios N. Mouchtaris

May 2003

This dissertation is concerned with enhancing an existing audio acquisition/reproduction system by providing both additional loudspeakers for rendering (virtual rendering) and multiple input audio channels (virtual acquisition). By providing virtual microphones and virtual loudspeakers, the methods proposed here can transform a given microphone/loudspeaker setup into a truly immersive environment. Immersive audio virtual rendering systems can be used to render virtual sound sources in three-dimensional space around a listener. This is achieved by simulating the amplitude and phase characteristics of the Head-Related Transfer Function (HRTF) using digital filters. In this work we examine certain key signal processing considerations in spatial sound rendering over headphones and loudspeakers. We address the problem of crosstalk, inherent in loudspeaker rendering, and examine two methods for efficiently implementing crosstalk cancellation and loudspeaker frequency response inversion. We demonstrate that crosstalk cancellation of 30 dB can be achieved with both methods. Our analysis is extended to non-symmetric listening positions and moving listeners. A method for generating the required crosstalk cancellation filters as the listener moves is developed based on low-rank modeling. Using the Karhunen-Loève expansion of the crosstalk filters, we can interpolate among designed filters to synthesize new ones, for which HRTF measurements are unavailable.
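To make the idea of crosstalk cancellation concrete, the following is a minimal sketch and not either of the two methods examined in the report. It assumes a toy, symmetric two-loudspeaker model at a single frequency, in which the same-side path has unit gain and the opposite-side (crosstalk) path is an attenuated, delayed copy. The function names, the 0.7 crosstalk gain, and the delay values are all hypothetical choices made for illustration. The canceller is simply the inverse of the 2x2 loudspeaker-to-ear transfer matrix, and the sketch also shows how a canceller designed for one position degrades when the listener moves, which is the situation that motivates regenerating the filters.

```python
import cmath
import math


def toy_transfer_matrix(freq_hz, cross_delay_s, cross_gain=0.7):
    """Hypothetical symmetric loudspeaker-to-ear matrix at one frequency.

    Entry [i][j] is the path from loudspeaker j to ear i. The same-side
    path has unit gain; the opposite-side path is attenuated and delayed.
    """
    cross = cross_gain * cmath.exp(-2j * math.pi * freq_hz * cross_delay_s)
    return [[1 + 0j, cross], [cross, 1 + 0j]]


def invert_2x2(m):
    """Closed-form inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is (nearly) singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[p[i][0] * q[0][j] + p[i][1] * q[1][j] for j in range(2)]
            for i in range(2)]


def separation_db(m):
    """Level of the wanted left path relative to the leak into the left ear."""
    direct, leak = abs(m[0][0]), abs(m[0][1])
    return math.inf if leak == 0 else 20 * math.log10(direct / leak)


FREQ_HZ = 1000.0

# Canceller designed for an assumed nominal listener position.
nominal = toy_transfer_matrix(FREQ_HZ, cross_delay_s=250e-6)
canceller = invert_2x2(nominal)

# Assumed small listener movement: the crosstalk delay changes slightly.
moved = toy_transfer_matrix(FREQ_HZ, cross_delay_s=260e-6)

print("no canceller:       ", separation_db(nominal), "dB")
print("matched canceller:  ", separation_db(matmul_2x2(nominal, canceller)), "dB")
print("listener has moved: ", separation_db(matmul_2x2(moved, canceller)), "dB")
```

In this toy model the matched canceller removes the crosstalk term entirely, while the mismatched case leaves a finite residual. A practical design would repeat this per frequency bin with measured HRTFs and would typically regularize the inversion, neither of which is attempted here.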

Download the report in PDF format: USC-SIPI-359.pdf (39.1 MB)