Parallel Discretization of Data Preparation Optimization in Data Mining

CLC Number: TN929.5

Abstract:

In data mining, discretizing the data can effectively improve mining efficiency. In this paper, we propose a data preprocessing algorithm (AOA) that obtains the optimal discretization by comparing candidate methods in parallel. For a given data set, we first perform feature detection to obtain its distribution characteristics, and then detect outliers according to those characteristics. In addition, the discretization result is selected by comparing the minimum Euclidean distance over the entropy, the variance index, and the stability parameter of the different discretization methods. In the simulation experiments, we compare AOA with four typical data discretization methods on different databases by running an association rule mining algorithm on the discretized data produced by AOA and by each of the four other methods. The results show that, under different minimum support thresholds, the number of association rules extracted from the data discretized by AOA is the smallest, indicating the higher efficiency of AOA.
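The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' AOA implementation: the abstract does not define the variance index or the stability parameter, so the sketch assumes a normalized within-bin variance as the variance index, omits the stability parameter, and assumes an ideal target point of (1, 0), meaning maximal entropy and zero within-bin variance. The two candidate methods (equal-width and equal-frequency binning) and all function names are hypothetical choices made for illustration only.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def equal_width(values, k):
    """Assign each value to one of k bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid division by zero on constant data
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = min(rank * k // len(values), k - 1)
    return labels


def normalized_entropy(labels, k):
    """Shannon entropy of the bin labels, scaled to [0, 1] by log2(k)."""
    if k < 2:
        return 0.0
    n = len(labels)
    h = -sum(c / n * math.log2(c / n) for c in Counter(labels).values())
    return h / math.log2(k)


def normalized_within_bin_variance(values, labels):
    """Within-bin sum of squares divided by the total sum of squares.

    Assumed stand-in for the paper's 'variance index'.
    """
    mean = sum(values) / len(values)
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0
    groups = {}
    for v, label in zip(values, labels):
        groups.setdefault(label, []).append(v)
    within_ss = 0.0
    for group in groups.values():
        m = sum(group) / len(group)
        within_ss += sum((x - m) ** 2 for x in group)
    return within_ss / total_ss


def select_discretization(values, k, methods=None, target=(1.0, 0.0)):
    """Score candidate methods concurrently and return the closest to target.

    The target point (1, 0) is an assumption, not taken from the paper.
    """
    methods = methods or {
        "equal_width": equal_width,
        "equal_frequency": equal_frequency,
    }

    def score(fn):
        labels = fn(values, k)
        point = (
            normalized_entropy(labels, k),
            normalized_within_bin_variance(values, labels),
        )
        return math.dist(point, target)

    # Evaluate every candidate method in its own worker thread.
    with ThreadPoolExecutor() as pool:
        distances = list(pool.map(score, methods.values()))
    scores = dict(zip(methods, distances))
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    best, scores = select_discretization([1, 2, 3, 4, 5, 6, 7, 8, 100], k=3)
    print(best, scores)
```

Threads are used here only to mirror the idea of comparing methods side by side; for CPU-bound scoring on large data sets, a process pool would likely be the more appropriate choice.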

Get Citation

Cite this article as: LIU Yun, YUAN Hao-Heng. Parallel Discretization of Data Preparation Optimization in Data Mining [J]. J Sichuan Univ: Nat Sci Ed, 2018, 55: 993.

History
  • Received: October 24, 2017
  • Revised: January 05, 2018
  • Adopted: January 23, 2018
  • Online: September 30, 2018