Text classification for ship industry news

CLC Number: TP391

    Abstract:

    News articles in the shipbuilding industry are long and specialized and contain a large amount of professional vocabulary. There is currently little research on the classification of news texts in this field, and a corresponding shipping industry news corpus is lacking. This paper builds a shipping industry news corpus and proposes a new text classification algorithm for ship industry news. First, the algorithm performs feature selection and feature dimension reduction based on document frequency, the chi-square statistic, and the topic model LSA. After the document-word matrix is mapped into the document-topic matrix, the processed features are classified using a support vector machine. Experiments on the classification of news texts show that the proposed algorithm can effectively address the high dimensionality and high sparsity of text vectors and achieves better classification results than traditional methods when sample sets are small and the number of categories is limited.
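
The pipeline described in the abstract (document-frequency filtering, chi-square feature selection, LSA reduction to a document-topic matrix, then SVM classification) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the use of scikit-learn, the TF-IDF weighting, the linear kernel, the parameter values (`min_df`, `k_best`, `n_topics`), the helper name `build_pipeline`, and the toy corpus are all assumptions made for the example.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def build_pipeline(min_df=2, k_best=6, n_topics=2):
    """Hypothetical helper: DF filter -> chi-square -> LSA -> linear SVM."""
    return Pipeline([
        # Document-frequency filter: drop terms seen in fewer than min_df docs.
        ("tfidf", TfidfVectorizer(min_df=min_df)),
        # Keep the k_best terms with the highest chi-square score per class labels.
        ("chi2", SelectKBest(chi2, k=k_best)),
        # LSA: truncated SVD maps the document-word matrix to document-topic space.
        ("lsa", TruncatedSVD(n_components=n_topics, random_state=0)),
        # Support vector machine on the reduced features (linear kernel assumed).
        ("svm", LinearSVC()),
    ])


# Toy corpus, invented purely for illustration.
docs = [
    "shipyard launches new container vessel hull",
    "shipyard delivers bulk carrier hull",
    "shipyard signs order for new vessel",
    "freight rates rise on container routes",
    "freight rates fall as demand weakens",
    "freight market demand lifts charter rates",
]
labels = ["shipbuilding"] * 3 + ["market"] * 3

pipe = build_pipeline().fit(docs, labels)
topic_matrix = pipe[:-1].transform(docs)  # document-topic matrix before the SVM
predictions = pipe.predict(docs)
print(topic_matrix.shape)
print(list(predictions))
```

On a real corpus, `k_best` and `n_topics` would be far larger and would normally be tuned by cross-validation; the values here are only small enough to fit the six-document toy example.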

History
  • Online: June 15, 2023
  • Published: January 31, 2020