A Multi-Resolution Approach to GAN-Based Speech Enhancement

Kim, Hyung Yong and Yoon, Ji Won and Cheon, Sung Jun and Kang, Woo Hyun and Kim, Nam Soo (2021) A Multi-Resolution Approach to GAN-Based Speech Enhancement. Applied Sciences, 11 (2). p. 721. ISSN 2076-3417

applsci-11-00721-v2.pdf - Published Version


Abstract

Generative adversarial networks (GANs) have recently been applied successfully to speech enhancement. However, two issues remain: (1) GAN-based training is typically unstable because of its non-convex nature, and (2) most conventional methods do not fully exploit the characteristics of speech, which can lead to a sub-optimal solution. To address these problems, we propose a progressive generator that handles speech in a multi-resolution fashion. We also propose a multi-scale discriminator that distinguishes real from generated speech at several sampling rates to stabilize GAN training. The proposed structure was compared with conventional GAN-based speech enhancement algorithms on the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach makes training faster and more stable, which improves performance on various speech enhancement metrics.
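
To illustrate what "various sampling rates" can mean in practice, the following is a minimal Python sketch that derives several lower-rate views of one waveform, the kind of input a multi-scale discriminator could compare. It is not the authors' implementation: the abstract does not say how the signal is resampled, so the averaging-based downsampler, the function names `downsample_by_averaging` and `multi_resolution_views`, the factors (1, 2, 4), and the 16 kHz base rate are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def downsample_by_averaging(signal: Sequence[float], factor: int) -> List[float]:
    """Reduce the sampling rate by an integer factor (hypothetical helper).

    Each output sample is the mean of `factor` consecutive input samples,
    which acts as a crude low-pass filter followed by decimation.
    Trailing samples that do not fill a whole frame are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(signal) - len(signal) % factor
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


def multi_resolution_views(
    signal: Sequence[float],
    base_rate: int,
    factors: Sequence[int] = (1, 2, 4),  # assumed factors, not from the paper
) -> Dict[int, List[float]]:
    """Map each sampling rate in Hz to the waveform resampled to that rate."""
    views: Dict[int, List[float]] = {}
    for factor in factors:
        if base_rate % factor != 0:
            raise ValueError("base_rate must be divisible by every factor")
        views[base_rate // factor] = downsample_by_averaging(signal, factor)
    return views


# Toy waveform: eight samples at a nominal 16 kHz (assumed base rate).
waveform = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
views = multi_resolution_views(waveform, base_rate=16000)
```

In a multi-scale setup, each entry of `views` would typically be passed to its own sub-discriminator, and the same views of the clean reference signal would serve as the "real" examples. A production system would more likely use a proper polyphase resampler than block averaging.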

Item Type: Article
Subjects: Pacific Library > Engineering
Depositing User: Unnamed user with email support@pacificlibrary.org
Date Deposited: 19 Jan 2023 10:50
Last Modified: 22 May 2024 09:58
URI: http://editor.classicopenlibrary.com/id/eprint/163
