On biases in estimating multi-valued attributes

I Kononenko - IJCAI, 1995 - Citeseer
Abstract
We analyse the biases of eleven measures for estimating the quality of multi-valued attributes. The values of information gain, J-measure, gini-index, and relevance tend to increase linearly with the number of values of an attribute. The values of gain-ratio, distance measure, Relief, and the weight of evidence decrease for informative attributes and increase for irrelevant attributes. The bias of the statistical tests based on the chi-square distribution is similar, but these functions are not able to discriminate among attributes of different quality. We also introduce a new function based on the MDL principle whose value decreases slightly as the number of an attribute's values increases.
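
The bias described for irrelevant attributes can be explored with a small simulation. The sketch below is not the paper's experimental setup: it uses the standard textbook definitions of information gain and gain ratio, and the helper `mean_scores`, the two-class uniform labels, the sample size, the trial count, and the seed are all assumptions chosen for illustration. It scores attributes that are random and independent of the class, so any non-zero score is due to estimation bias alone.

```python
import math
import random
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attr, labels):
    """Class entropy minus the expected class entropy after splitting on attr."""
    groups = defaultdict(list)
    for a, c in zip(attr, labels):
        groups[a].append(c)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(attr, labels):
    """Information gain divided by the entropy of the attribute's own values."""
    split_info = entropy(attr)
    return information_gain(attr, labels) / split_info if split_info > 0 else 0.0


def mean_scores(num_values, n_examples=200, trials=50, seed=0):
    """Hypothetical helper: average both scores over random irrelevant attributes."""
    rng = random.Random(seed)
    ig_total = gr_total = 0.0
    for _ in range(trials):
        # Labels and attribute values are drawn independently, so the
        # attribute carries no real information about the class.
        labels = [rng.randrange(2) for _ in range(n_examples)]
        attr = [rng.randrange(num_values) for _ in range(n_examples)]
        ig_total += information_gain(attr, labels)
        gr_total += gain_ratio(attr, labels)
    return ig_total / trials, gr_total / trials


if __name__ == "__main__":
    for v in (2, 4, 8, 16, 32):
        ig, gr = mean_scores(v)
        print(f"values={v:2d}  mean info gain={ig:.4f}  mean gain ratio={gr:.4f}")
```

Running the script prints the mean score for each number of attribute values, which can be compared against the trends stated in the abstract. An extreme deterministic case is an attribute with one distinct value per example: its information gain equals the full class entropy even though it has no predictive value.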