Speech enhancement method based on A-DResUnet

CLC Number: TN912.35

    Abstract:

    To extract feature information from spectrograms more accurately, this paper proposes a speech enhancement method based on A-DResUnet (attention-dilated ResUnet). The A-DResUnet model adds dilated convolution to the ResUnet model to improve its ability to capture the contextual information of speech. In addition, a convolutional block attention module (CBAM) is added to the ResUnet encoder so that the model attends more closely to the features of the noise spectrogram. The experimental results show that the model separates unknown noise better when its output target is the noise spectrum than when its output target is the clean speech spectrum. Compared with the ResUnet model, the proposed A-DResUnet model reduces the loss of speech detail information. Compared with speech enhancement methods based on DNN and GAN, PESQ increases by an average of 22.81% and 33.11%, respectively, and STOI increases by an average of 9.62% and 15.33%, respectively, making the proposed method more effective for speech enhancement in complex environments.
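
The abstract credits dilated convolution with improving the capture of contextual information. The following is a minimal pure-Python sketch of that general idea, not the authors' implementation: it applies a 1-D dilated convolution along a single axis and computes how the receptive field of a stack of layers grows with the dilation rate. The function names, the kernel size of 3, and the dilation rates (1, 2, 4) are illustrative assumptions; the abstract does not state the settings used in A-DResUnet.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output value
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical input: 16 samples along one axis of a spectrogram.
frame = [float(i) for i in range(16)]
kernel = [1.0, 1.0, 1.0]  # assumed 3-tap kernel

plain = dilated_conv1d(frame, kernel, dilation=1)  # each output sees 3 adjacent samples
wide = dilated_conv1d(frame, kernel, dilation=4)   # each output spans 9 samples

rf_plain = receptive_field(3, [1, 1, 1])    # three ordinary layers
rf_dilated = receptive_field(3, [1, 2, 4])  # three layers with assumed dilation rates
```

With the same three weights per layer, the dilated stack covers a wider context than the ordinary stack, which is the property the abstract relies on. Comparing `rf_plain` with `rf_dilated` shows the difference for these assumed settings.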

History
  • Online: March 29, 2023