The Estimation of Probabilities in Attribute Selection Measures for Decision Tree Induction

Bojan Cestnik, Aram Karalic

Abstract

In this paper we analyze two well-known measures for attribute selection in decision tree induction: informativity and the Gini index. In particular, we are interested in how different methods for estimating probabilities influence these two measures. Experimental results show that the measures obtained with different probability estimation methods produce different preference orders of attributes at a given node, and therefore different structures of the constructed decision tree. This property can be very beneficial, especially in real-world applications, where several different trees are often required.
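The effect described in the abstract can be illustrated with a minimal sketch. The abstract does not name the estimation methods that were compared, so the sketch assumes two common ones, relative frequency and the Laplace estimate. It also assumes that informativity is the entropy-based information gain. The function names and the toy class counts are hypothetical and are not taken from the paper.

```python
import math

def relative_frequency(counts):
    """Estimate class probabilities as n_c / n."""
    n = sum(counts)
    return [c / n for c in counts]

def laplace(counts):
    """Laplace estimate (assumed here): (n_c + 1) / (n + k) for k classes."""
    n = sum(counts)
    k = len(counts)
    return [(c + 1) / (n + k) for c in counts]

def entropy(probs):
    """Entropy in bits; terms with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity: 1 - sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def split_score(branches, impurity, estimate):
    """Impurity decrease achieved by splitting a node on one attribute.

    branches: one list of class counts per attribute value.
    Branch weights use plain relative frequency to keep the sketch
    simple; only the class probabilities use `estimate`.
    """
    totals = [sum(col) for col in zip(*branches)]
    n = sum(totals)
    before = impurity(estimate(totals))
    after = sum(sum(b) / n * impurity(estimate(b)) for b in branches)
    return before - after

# Toy node with 4 positive and 4 negative examples (hypothetical data).
# Attribute A has eight values, each covering a single example.
attr_a = [[1, 0]] * 4 + [[0, 1]] * 4
# Attribute B has two values covering larger groups of examples.
attr_b = [[4, 1], [0, 3]]

for impurity in (entropy, gini):
    for estimate in (relative_frequency, laplace):
        a = split_score(attr_a, impurity, estimate)
        b = split_score(attr_b, impurity, estimate)
        best = "A" if a > b else "B"
        print(f"{impurity.__name__:8s} {estimate.__name__:18s} "
              f"A={a:.3f} B={b:.3f} prefers {best}")
```

On this toy node, both measures prefer attribute A under relative frequency and attribute B under the Laplace estimate, because the Laplace estimate pulls the probabilities in A's single-example branches toward the uniform distribution. The change in preference order comes from the estimation method alone, since the data and the measure are the same in each pair.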
