Parametric resynthesis of measured spatial room impulse responses

Anthony Gallien¹, Benoît Alary¹, Markus Noisternig¹
anthony.gallien@ircam.fr (Corresponding author)

¹Sorbonne Université, STMS, IRCAM, CNRS, Ministère de la Culture, Paris, France

HAL DAFx

Abstract

Spatial Room Impulse Responses (SRIRs) are fundamental to immersive audio rendering and have become a key focus of recent machine learning research in acoustics and auralization. Due to the high computational cost of direct convolution, spatial audio systems commonly employ artificial reverberation algorithms. However, these approaches often fail to accurately reproduce the spatial, temporal, and spectral characteristics of early reflections, leading to notable deviations from measured SRIRs. This paper presents a comprehensive framework for the analysis and efficient resynthesis of SRIRs captured with Spherical Microphone Arrays (SMAs). The proposed method accounts for hardware-induced artifacts, including scattering and spatial aliasing. Early reflections are reconstructed using a parametric approach based on the Herglotz analysis method, while late reverberation is synthesized using a Directional Feedback Delay Network (DFDN) with optimized filter-attenuation and correlation-matching. The proposed framework produces signals that closely match the objective metrics of measured SRIRs, demonstrating its effectiveness both for real-time spatial audio rendering and for generating realistic datasets for machine learning applications.

Evaluation

The following table presents binaural listening examples demonstrating the performance of the proposed framework. It features synthesized SRIRs from various measured rooms, convolved with dry audio sources and binaurally encoded using MagLS.
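The convolution step behind these examples can be sketched as follows. This is a minimal illustration, not the authors' rendering chain: the function name `convolve_srir`, the 4-channel toy SRIR, and the noise "dry" signal are all assumptions made for the example, and the MagLS binaural encoding stage is not shown.

```python
import numpy as np

def convolve_srir(dry, srir):
    """Convolve a mono dry signal, shape (n,), with a multichannel SRIR, shape (channels, m).

    Hypothetical helper: FFT-based linear convolution applied to every channel.
    """
    n_out = dry.shape[0] + srir.shape[1] - 1
    n_fft = 1 << (n_out - 1).bit_length()  # next power of two >= n_out
    spectrum = np.fft.rfft(dry, n_fft)[None, :] * np.fft.rfft(srir, n_fft, axis=1)
    return np.fft.irfft(spectrum, n_fft, axis=1)[:, :n_out]

# Toy data (assumed): noise as the dry source, exponentially decaying noise as a 4-channel SRIR.
rng = np.random.default_rng(0)
fs = 48000
dry = rng.standard_normal(fs // 10)
t = np.arange(fs // 5) / fs
srir = rng.standard_normal((4, t.size)) * np.exp(-t / 0.05)

wet = convolve_srir(dry, srir)  # one reverberated signal per SRIR channel
```

Each output channel would then be passed to a binaural decoder; with real data, `srir` would hold the measured or resynthesized spherical-harmonic channels.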

Note that the energy decay relief (EDR) error figure for the Chaillot theatre shows a notably larger error. This results directly from the measured SRIR having a multi-slope decay, whereas the proposed framework assumes a single decay slope. Despite this deviation, the error is not highly perceptible when listening to the generated examples.
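To illustrate why a single-slope assumption produces a larger error on a multi-slope decay, here is a small sketch under simplifying assumptions. It uses a broadband energy decay curve obtained by Schroeder backward integration rather than the time-frequency EDR shown in the figures, and the synthetic impulse responses, decay constants, and function names (`edc_db`, `single_slope_error`) are hypothetical.

```python
import numpy as np

def edc_db(ir):
    """Energy decay curve via Schroeder backward integration, normalized to 0 dB at t = 0."""
    energy = np.cumsum(ir[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])

def single_slope_error(ir, fs, floor_db=-60.0):
    """RMS error (dB) between the decay curve and its best-fit straight line above floor_db."""
    curve = edc_db(ir)
    t = np.arange(ir.size) / fs
    mask = curve > floor_db
    slope, intercept = np.polyfit(t[mask], curve[mask], 1)
    residual = curve[mask] - (slope * t[mask] + intercept)
    return float(np.sqrt(np.mean(residual ** 2)))

# Synthetic responses (assumed): the same noise shaped by one or two exponential envelopes.
fs = 16000
t = np.arange(3 * fs) / fs
noise = np.random.default_rng(1).standard_normal(t.size)
single = noise * np.exp(-t / 0.15)
double = noise * (np.exp(-t / 0.05) + 0.1 * np.exp(-t / 0.4))

err_single = single_slope_error(single, fs)
err_double = single_slope_error(double, fs)
print(f"single-slope decay: {err_single:.2f} dB, double-slope decay: {err_double:.2f} dB")
```

In this toy setting the straight-line fit leaves a larger residual for the double-slope response than for the single-slope one, which is the same qualitative behaviour as the larger error reported for the Chaillot theatre.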

In the Athénée theatre example, a discrepancy is visible on the EDR error map in the early reflection phase (prior to the mixing time, Tmix). This artifact stems from the Herglotz analysis failing to capture certain salient reflections, an omission that becomes audible in the drums listening example.

Location & EDR error	Vocals	Cello	Drums	Darth Vader
IRCAM ESPRO hall dry	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
IRCAM ESPRO hall reverberant	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
St Eustache cathedral	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
Chiesa San Lorenzo	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
Singer-Polignac fondation (staircase)	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
Chaillot theatre	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method
Athénée theatre	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method	Dry sound Measurement Proposed Method