A Two-Stage Approach to Quality Restoration of Bone-Conducted Speech

Author: Changtao Li
Email: lichangtao@mail.ioa.ac.cn

Abstract

Bone-conducted speech is not susceptible to background noise, but its limited bandwidth results in poor speech quality and intelligibility. This paper proposes a two-stage approach to restoring the quality of bone-conducted speech: bandwidth extension followed by a speech vocoder. In the first stage, a deep neural network is trained to map a low-resolution representation of the bone-conducted speech, i.e., its log Mel-scale spectrogram, to that of the air-conducted speech, thereby extending the bandwidth of the bone-conducted speech. In the second stage, a speech vocoder transforms the extended log Mel-scale spectrogram back into a time-domain waveform. Because the correspondence between air-conducted and bone-conducted speech is many-to-many, supervised learning may not be the best training protocol for the bone-conducted/air-conducted feature mapping. We therefore propose to leverage adversarial training to further improve the bandwidth extension performance in the first stage. The two stages are decoupled and can be trained independently. The vocoder is trained on a large multi-speaker dataset and generalizes well to unseen speakers. The vocoder also helps to remedy spectral artifacts introduced in the bandwidth extension stage. Objective and subjective evaluations on the ESMB dataset show that the proposed two-stage system substantially outperforms existing bone-conducted speech enhancement systems.
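To make the "low-resolution representation" mentioned in the abstract concrete, the following is a minimal, dependency-free sketch of how one frame of a power spectrum could be reduced to log Mel-scale energies. It is an illustration only, not the authors' feature extractor: the HTK-style Mel formula, the 16 kHz sample rate, the 80 Mel bands, and the function names (hz_to_mel, mel_band_edges, log_mel_frame) are all assumptions chosen for the example.

```python
import math


def hz_to_mel(f_hz):
    """Hz-to-Mel conversion, HTK-style formula (one common convention; assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(n_mels, f_min, f_max):
    """Return n_mels + 2 band-edge frequencies in Hz, equally spaced on the Mel scale."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_mels + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_mels + 2)]


def log_mel_frame(power_spectrum, sample_rate, n_mels=80, eps=1e-10):
    """Project one frame's power spectrum onto triangular Mel filters and take the log.

    power_spectrum holds bins evenly spanning 0 Hz to sample_rate / 2.
    n_mels=80 is a hypothetical setting, not a value taken from the paper.
    """
    n_bins = len(power_spectrum)
    nyquist = sample_rate / 2.0
    edges = mel_band_edges(n_mels, 0.0, nyquist)
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    frame = []
    for b in range(n_mels):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        energy = 0.0
        for f, p in zip(bin_hz, power_spectrum):
            if lo < f <= mid:      # rising slope of the triangle
                energy += p * (f - lo) / (mid - lo)
            elif mid < f < hi:     # falling slope of the triangle
                energy += p * (hi - f) / (hi - mid)
        frame.append(math.log(energy + eps))  # eps avoids log(0) for empty bands
    return frame
```

Under these assumptions, a 257-bin frame is reduced to 80 values, which is why the abstract can describe the log Mel-scale spectrogram as low-resolution: the first-stage network would only need to map between such compact frames, and the vocoder is left to reconstruct the waveform detail.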

Samples

We encourage readers to listen to the following audio samples to judge the quality of our enhanced speech. Because the page contains a large number of audio files, loading may take some time. Please wait a moment, or close other web pages and refresh this page to improve the loading speed.

Sample1 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample2 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample3 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample4 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample5 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample6 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample7 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample8 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample9 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample10 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample11 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample12 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample13 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample14 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample15 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample16 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample17 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample18 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample19 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample20 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample21 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample22 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample23 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample24 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample25 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample26 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample27 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample28 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample29 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC
Sample30 DCCRN-AF DCCRN-BC S2 DPT-EGNet
Proposed S1 WaveNet BC
AC