Qualitative Data Classification Based on Chi-Square Dissimilarity and t-SNE
Affiliation:

Shanxi Electronic Information Institute, Shanxi Xi’an 710500

CLC Number:

TP311.13

    Abstract:

    To address low classification accuracy and high computational cost on qualitative data, a method for classifying categorical variables is proposed that improves class separability by combining traditional classifiers with different mapping techniques. The initial features (categorical attributes) are mapped to a real-valued space using the chi-square (C-S) distance as the dissimilarity measure, which increases the dimension of the feature space and improves class separability. The t-distributed stochastic neighbor embedding (t-SNE) algorithm is then used to reduce the data to two or three features, which reduces the computation time of the learning method. Experiments on common categorical data sets show that combining C-S mapping with t-SNE maintains recognition accuracy while greatly reducing the computation required by the recognition task. When only C-S mapping is applied to a data set, class separability is still enhanced, which significantly improves the performance of the learning algorithm.
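
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following rests on stated assumptions: categorical attributes are one-hot encoded, the C-S dissimilarity is taken to be the chi-square distance between row profiles as used in correspondence analysis, and each sample is mapped to its vector of dissimilarities to all samples (which raises the feature dimension to the number of samples). The names `one_hot`, `chi_square_map`, and `embed_tsne` are hypothetical helpers, and scikit-learn's `TSNE` stands in for whatever t-SNE implementation the authors used.

```python
import numpy as np


def one_hot(rows):
    """Indicator matrix with one column per (attribute index, value) pair."""
    n_attrs = len(rows[0])
    columns = sorted({(j, row[j]) for row in rows for j in range(n_attrs)})
    index = {col: k for k, col in enumerate(columns)}
    z = np.zeros((len(rows), len(columns)))
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            z[i, index[(j, value)]] = 1.0
    return z


def chi_square_map(rows):
    """Map each categorical record to its chi-square distances to all records.

    Assumption: distance between row profiles, each squared difference
    weighted by the inverse of the column mass (correspondence-analysis style).
    """
    z = one_hot(rows)
    profiles = z / z.sum(axis=1, keepdims=True)   # row profiles
    col_mass = z.sum(axis=0) / z.sum()            # relative column frequencies
    scaled = profiles / np.sqrt(col_mass)         # apply chi-square weighting
    diff = scaled[:, None, :] - scaled[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))       # (n_samples, n_samples)


def embed_tsne(features, n_components=2, seed=0):
    """Reduce the mapped features to two or three dimensions with t-SNE."""
    from sklearn.manifold import TSNE  # imported lazily; requires scikit-learn

    # Perplexity must be smaller than the number of samples.
    perplexity = min(30.0, (features.shape[0] - 1) / 3.0)
    tsne = TSNE(n_components=n_components, perplexity=perplexity,
                init="random", random_state=seed)
    return tsne.fit_transform(features)
```

Under these assumptions, `chi_square_map(rows)` yields a real-valued feature matrix that can be passed directly to a conventional classifier, or first to `embed_tsne` to obtain two or three features per sample before classification.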

History
  • Online: October 24, 2024