A New Imputation Algorithm Based Approach for Missing Attribute Values in Databases: An Experimental Approach

Madhu G

Abstract


The presence of missing values (MV) distorts the relationship between the features and their respective class attribute values. Inappropriate handling of MV may lower the predictive capability of machine learning classifiers. Imputation, which estimates a missing value from the tuples nearest to the tuple that contains it (its nearest neighbors), is one of the most popular MV-handling procedures in the literature for improving classification accuracy. Recent imputation algorithms identify the most probable tuples for imputing a value from the distances between tuples. The MV is then replaced with the average value if the attribute is real, or with the most frequent value if the attribute is categorical or integer. In this paper we give readers deeper insight into the imputation process and provide illustrative examples of a recent imputation algorithm based on a new indexing measure for computing the similarity between any two tuples. The imputation algorithm is applied to benchmark datasets from the KEEL and UCI repositories that have different data types and varying percentages of missing values. The results demonstrate better performance of the algorithm when compared with other state-of-the-art imputation algorithms.
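The replacement rule described above (average for real attributes, most frequent value for categorical or integer attributes, taken over the nearest tuples) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not define the new index measure, so the `distance` function here is a hypothetical stand-in (range-normalized difference for real attributes, match/mismatch otherwise), and the names `MISSING`, `impute`, `real_cols`, and `k` are invented for this example.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing value in this sketch


def distance(a, b, real_cols, ranges):
    """Hypothetical stand-in distance; NOT the paper's index measure.

    Averages per-attribute differences over attributes observed in both tuples.
    """
    total, used = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if x is MISSING or y is MISSING:
            continue  # skip attributes that either tuple is missing
        if j in real_cols:
            total += abs(x - y) / ranges[j] if ranges[j] else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("inf")


def impute(rows, real_cols, k=2):
    """Fill each missing value from the k nearest tuples that observe it."""
    real_cols = set(real_cols)
    # Observed range of each real attribute, used to normalize differences.
    ranges = {}
    for j in real_cols:
        vals = [r[j] for r in rows if r[j] is not MISSING]
        ranges[j] = (max(vals) - min(vals)) if vals else 0.0

    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Candidate donors: other tuples with attribute j observed.
            donors = [r for t, r in enumerate(rows)
                      if t != i and r[j] is not MISSING]
            donors.sort(key=lambda r: distance(row, r, real_cols, ranges))
            nearest = [r[j] for r in donors[:k]]
            if not nearest:
                continue  # no donor available; leave the value missing
            if j in real_cols:
                result[i][j] = sum(nearest) / len(nearest)  # average
            else:
                result[i][j] = Counter(nearest).most_common(1)[0][0]  # mode
    return result


# Toy data (invented): column 0 is real, column 1 categorical, column 2 integer.
rows = [
    (1.0, "a", 3),
    (1.2, "a", 3),
    (5.0, "b", 7),
    (5.4, "b", 7),
    (None, "a", 3),
    (5.1, None, 7),
]
filled = impute(rows, real_cols={0}, k=2)
```

In this toy run the missing real value is filled with the mean of its two nearest donors and the missing categorical value with their most frequent category. Distances are always computed on the original tuples, so one imputed value never influences another.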

Keywords


dataset; missing value; index measure; imputation

IJAIKD is currently indexed by: Journal Seek