Thomas Esch

Model-Based Speech Enhancement Exploiting Temporal and Spectral Dependencies

1st edition

162 pages


Series: ABDN

Volume number: 32

ISBN: 978-3-86130-359-6


Item number: 978-3-86130-359-6


Mobile telephony has become an integral part of everyday life for billions of people around the world. Exchanging information via speech is now possible from almost anywhere at any time. However, even though the vision of permanent reachability and connectivity has been realized nearly worldwide, there is still room for improvement in the transmission of speech under noisy conditions. The performance of any speech communication system may deteriorate significantly when the speech signal is disturbed by ambient interference such as traffic noise or office noise, possibly leading to poor speech quality and intelligibility. In this thesis, a novel model-based speech enhancement system is presented which performs single-channel noise reduction of degraded speech signals. In contrast to state-of-the-art noise suppression techniques, the developed algorithms explicitly exploit temporal and spectral dependencies of speech and noise signals. To account for the temporal correlation, a modified Kalman filter is derived in the frequency domain. As its main novelties, the proposed solution performs complex-valued prediction of speech and noise DFT coefficients and uses SNR-dependent MMSE estimators which are adapted to measured statistics of the input signal. To incorporate the spectral dependencies of speech signals, a new wideband speech enhancement system is presented which utilizes techniques known from artificial bandwidth extension. The developed method reuses the processed and enhanced signal from lower frequencies to improve the results of a conventional noise suppression technique at higher frequencies. In addition, this work proposes effective countermeasures to reduce the occurrence of musical noise and provides a novel solution for the suppression of rapidly time-varying harmonic noise.

Weight: 245 g
Dimensions: 14.5 × 21.0 cm
