Speech enhancement method based on WDGAN-div
Affiliation:

1. Communication Sergeants College, PLA Army Engineering University, Chongqing 400035, China; 2. Hefei iFlytek Digital Technology limited company, Hefei 230088, China

CLC Number:

TP391.4

    Abstract:

    To address the poor adaptability and unsatisfactory performance of traditional speech enhancement methods in low signal-to-noise ratio (SNR) environments, this paper proposes a speech enhancement method based on a Wasserstein divergence deep generative adversarial network (WDGAN-div). The DGAN consists of five generators and one discriminator. The five generators enhance the noisy speech signal five times, which effectively improves the enhancement performance of the DGAN in low-SNR environments. In addition, Wasserstein divergence is used to optimize network training, which addresses the problems of the traditional GAN training process and improves the stability of DGAN training. Compared with traditional speech enhancement methods, the proposed method shows significantly better noise adaptability and enhancement performance in low-SNR environments. Experimental results show that, relative to the original noisy speech, the enhanced speech improves segmental SNR (SegSNR) by 6.1 dB, PESQ by 28.9%, and STOI by 10.6% on average.
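
As an illustration of the SegSNR figure quoted above, the following is a minimal sketch of how a segmental SNR can be computed from a clean reference and a processed signal. It is not the authors' evaluation code: the function name `segmental_snr`, the frame length, the hop size, and the per-frame clamp to [-10, 35] dB are assumptions chosen for illustration, since the abstract does not state the settings used.

```python
import math


def segmental_snr(clean, processed, frame_len=256, hop=128, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame's SNR clamped to [lo, hi].

    frame_len, hop, lo and hi are illustrative assumptions, not values
    taken from the paper.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    eps = 1e-10  # avoids division by zero and log of zero
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, hop):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        noise_energy = sum((x - y) ** 2 for x, y in zip(c, p))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        frame_snrs.append(min(max(snr_db, lo), hi))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Synthetic stand-ins for speech: a 440 Hz tone plus a 3 kHz interferer.
fs = 16000
clean = [math.sin(2 * math.pi * 440 * n / fs) for n in range(1024)]
interferer = [math.sin(2 * math.pi * 3000 * n / fs) for n in range(1024)]
noisy = [s + 0.5 * v for s, v in zip(clean, interferer)]
enhanced = [s + 0.1 * v for s, v in zip(clean, interferer)]  # hypothetical output

# SegSNR gain of the hypothetical enhanced signal over the noisy input, in dB.
gain_db = segmental_snr(clean, enhanced) - segmental_snr(clean, noisy)
```

A SegSNR improvement such as the one reported in the abstract corresponds to this kind of difference, averaged over the test utterances and noise conditions.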

History
  • Online: July 08, 2024