Abstract: News articles about the shipbuilding industry are long, specialized, and dense with technical vocabulary. Little research has addressed the classification of news texts in this field, and a corresponding shipping industry news corpus is lacking. This paper builds a shipping industry news corpus and proposes a new text classification algorithm for ship industry news. The algorithm first performs feature selection and feature dimension reduction based on document frequency, the chi-square statistic, and the topic model LSA (latent semantic analysis), which maps the document-word matrix into a document-topic matrix. The processed features are then classified with a support vector machine. Experiments on the classification of news texts show that the proposed algorithm effectively addresses the high dimensionality and high sparsity of text vectors and, for small sample sets and a limited number of categories, achieves better classification results than traditional methods.
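The feature selection stage named in the abstract can be illustrated with a minimal sketch. It covers only the document frequency filter and chi-square scoring; the LSA projection and the support vector machine are not shown. The function names, the `min_df` and `top_k` parameters, the rule of taking the maximum chi-square score over classes, and the toy documents are all assumptions made for illustration, not the paper's implementation or data.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def chi_square(term, label, docs, labels):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        present = term in tokens
        if present and y == label:
            a += 1  # term present, document in class
        elif present:
            b += 1  # term present, document not in class
        elif y == label:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, min_df=2, top_k=3):
    """Drop rare terms, then keep the top_k terms by chi-square score."""
    df = document_frequency(docs)
    candidates = [t for t, count in df.items() if count >= min_df]
    classes = sorted(set(labels))
    # Assumption: a term's score is its maximum chi-square over all classes.
    scores = {
        t: max(chi_square(t, c, docs, labels) for c in classes)
        for t in candidates
    }
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k], scores


# Hypothetical tokenized documents and labels, for illustration only.
docs = [
    ["hull", "welding", "shipyard"],
    ["hull", "shipyard", "order"],
    ["freight", "port", "order"],
    ["freight", "port", "shipyard"],
]
labels = ["build", "build", "ship", "ship"]

selected, scores = select_features(docs, labels)
```

In a full pipeline, the selected terms would define the columns of the document-word matrix that LSA then reduces to a document-topic matrix before classification.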