Three-channel pathological speech recognition based on LMD-improved feature extraction
Affiliation:

School of Information and Communication Engineering, North University of China, Taiyuan 030024, China

CLC Number:

TN912.34; R741


    Abstract:

    Patients with dysphonia often cannot pronounce speech clearly and accurately, which lowers the recognition rate for pathological speech. To address this, an improved Gammatone filter bank (GFbank) map feature extraction algorithm based on local mean decomposition (LMD) is proposed for three-channel pathological speech recognition. First, the algorithm decomposes the speech signal with LMD, applies a short-time Fourier transform to each decomposed component, and synthesizes the resulting spectra, from which filter bank features and their first-order and second-order difference features are extracted. Together these form LMD-GFbank map features that capture effective local characteristics of pathological speech. Second, to reduce the loss of useful feature information by the network model during training, a three-channel pathological speech recognition model is proposed. Finally, the recognition model is trained and tested on the combined speech feature information. Experimental results show that LMD-GFbank map features reach a recognition rate of 93.36% on the three-channel pathological speech recognition model, outperforming traditional MFCC, GFCC, and Fbank features. This confirms that the proposed algorithm and recognition model improve the accuracy of pathological speech recognition.
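
The first-order and second-order difference features mentioned in the abstract can be illustrated with a minimal sketch. It assumes a filter bank map has already been computed as a (frames x bands) array and uses the common regression-based delta formula. The function names `delta` and `stack_with_deltas`, the window half-width of 2, and the stacked (3, frames, bands) layout are all illustrative assumptions, not details taken from the paper, and the LMD decomposition and Gammatone filtering steps are not shown.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based difference along the time axis (axis 0).

    `features` is a (frames, bands) array. `width` is the half-width of
    the regression window; the value 2 is an assumption for this sketch.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_with_deltas(fbank_map, width=2):
    """Stack static, first-order, and second-order features.

    Returns an array of shape (3, frames, bands). This layout is a
    hypothetical choice for illustration only.
    """
    d1 = delta(fbank_map, width)
    d2 = delta(d1, width)
    return np.stack([np.asarray(fbank_map, dtype=float), d1, d2], axis=0)
```

For a feature map that rises linearly over time, the first-order plane equals the slope away from the edges and the second-order plane is zero there, which is a quick sanity check on the formula.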

History
  • Online: November 04, 2024