Inducing decision trees is a popular method in machine learning. The information gain computed for each attribute and its threshold helps finding a small number of rules for data classification. However, there has been little research on how many rules are appropriate for a given set of data. In this paper, an evolutionary multi-objective optimization approach with genetic programming will be applied to the data classification problem in order to find the minimum error rate for each size of decision trees. Following structural risk minimization suggested by Vapnik, we can determine a desirable number of rules with the best generalization performance. A hierarchy of decision trees for classification performance can be provided and it is compared with C4.5 application.