Acoustic Matching By Embedding Impulse Responses

Author(s): Su, Jiaqi; Jin, Zeyu; Finkelstein, Adam

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1d84j

Abstract:	The goal of acoustic matching is to transform an audio recording made in one acoustic environment to sound as if it had been recorded in a different environment, based on reference audio from the target environment. This paper introduces a deep learning solution for two parts of the acoustic matching problem. First, we characterize acoustic environments by mapping audio into a low-dimensional embedding invariant to speech content and speaker identity. Next, a waveform-to-waveform neural network conditioned on this embedding learns to transform an input waveform to match the acoustic qualities encoded in the target embedding. Listening tests on both simulated and real environments show that the proposed approach improves on state-of-the-art baseline methods.
Publication Date:	2020
Citation:	Su, Jiaqi, Zeyu Jin, and Adam Finkelstein. "Acoustic Matching By Embedding Impulse Responses." In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020): pp. 426-430. doi:10.1109/ICASSP40776.2020.9054701
DOI:	10.1109/ICASSP40776.2020.9054701
ISSN:	1520-6149
EISSN:	2379-190X
Pages:	426-430
Type of Material:	Conference Article
Journal/Proceeding Title:	IEEE International Conference on Acoustics, Speech and Signal Processing
Version:	Author's manuscript