A new criterion in selection and discretization of attributes for the generation of decision trees

Byung Hwan Jun, Chang Soo Kim, Hong Yeop Song, Jaihie Kim

Research output: Contribution to journal › Article

34 Citations (Scopus)

Abstract

It is important to use a better criterion in the selection and discretization of attributes for the generation of decision trees, in order to construct a better classifier in the area of pattern recognition and to access huge amounts of data intelligently and efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. In this paper we propose a new criterion, also based on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation, where the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis leads to some specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results may implicitly cover all the other situations not covered by the analysis.
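The abstract contrasts the standard gain and gain ratio criteria with an entropy calculation whose logarithm base equals the number of successors to a node. As a rough illustration of those standard quantities (not the paper's new criterion, which this record does not specify), a minimal Python sketch:

```python
import math
from collections import Counter

def entropy(labels, base=2.0):
    """Shannon entropy of a class-label sequence, in the given log base."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n, base)
                for c in Counter(labels).values())

def information_gain(labels, partitions, base=2.0):
    """Gain = H(parent) - weighted average of the children's entropies."""
    n = len(labels)
    children = sum(len(p) / n * entropy(p, base) for p in partitions)
    return entropy(labels, base) - children

def gain_ratio(labels, partitions):
    """Gain divided by split information (entropy of the partition sizes)."""
    n = len(labels)
    split_info = -sum((len(p) / n) * math.log2(len(p) / n)
                      for p in partitions if p)
    g = information_gain(labels, partitions)
    return g / split_info if split_info > 0 else 0.0

# Hypothetical example: a three-way split that separates classes perfectly,
# so the gain ratio is 1.0 (gain equals split information here).
labels = ['a', 'a', 'b', 'b', 'b', 'c']
parts = [['a', 'a'], ['b', 'b', 'b'], ['c']]
print(information_gain(labels, parts))
print(gain_ratio(labels, parts))

# Using log base = number of successors, as the abstract notes, rescales
# each child's entropy into [0, 1], making splits of different arity
# comparable on a common scale.
k = len(parts)  # three successors -> log base 3
print(information_gain(labels, parts, base=k))
```

Here only the log base changes between the last two gain computations; the normalization by branch count is the point the abstract emphasizes.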

Original language: English
Pages (from-to): 1371-1375
Number of pages: 5
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 19
Issue number: 12
DOI: 10.1109/34.643896
Publication status: Published - 1997 Dec 1

Fingerprint

  • Decision trees
  • Entropy
  • Discretization
  • Attributes
  • Pattern recognition
  • Classifiers
  • Computer simulation
  • Theoretical analysis
  • Logarithm
  • Partition

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics

Cite this

@article{85d0b16a618347bf850916c3002465a5,
title = "A new criterion in selection and discretization of attributes for the generation of decision trees",
abstract = "It is important to use a better criterion in the selection and discretization of attributes for the generation of decision trees, in order to construct a better classifier in the area of pattern recognition and to access huge amounts of data intelligently and efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. In this paper we propose a new criterion, also based on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation, where the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis leads to some specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results may implicitly cover all the other situations not covered by the analysis.",
author = "Jun, {Byung Hwan} and Kim, {Chang Soo} and Song, {Hong Yeop} and Kim, {Jaihie}",
year = "1997",
month = "12",
day = "1",
doi = "10.1109/34.643896",
language = "English",
volume = "19",
pages = "1371--1375",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "12",

}

A new criterion in selection and discretization of attributes for the generation of decision trees. / Jun, Byung Hwan; Kim, Chang Soo; Song, Hong Yeop; Kim, Jaihie.

In: IEEE transactions on pattern analysis and machine intelligence, Vol. 19, No. 12, 01.12.1997, p. 1371-1375.

Research output: Contribution to journal › Article

TY - JOUR

T1 - A new criterion in selection and discretization of attributes for the generation of decision trees

AU - Jun, Byung Hwan

AU - Kim, Chang Soo

AU - Song, Hong Yeop

AU - Kim, Jaihie

PY - 1997/12/1

Y1 - 1997/12/1

N2 - It is important to use a better criterion in the selection and discretization of attributes for the generation of decision trees, in order to construct a better classifier in the area of pattern recognition and to access huge amounts of data intelligently and efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. In this paper we propose a new criterion, also based on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation, where the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis leads to some specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results may implicitly cover all the other situations not covered by the analysis.

AB - It is important to use a better criterion in the selection and discretization of attributes for the generation of decision trees, in order to construct a better classifier in the area of pattern recognition and to access huge amounts of data intelligently and efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. In this paper we propose a new criterion, also based on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation, where the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis leads to some specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results may implicitly cover all the other situations not covered by the analysis.

UR - http://www.scopus.com/inward/record.url?scp=0031360754&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0031360754&partnerID=8YFLogxK

U2 - 10.1109/34.643896

DO - 10.1109/34.643896

M3 - Article

AN - SCOPUS:0031360754

VL - 19

SP - 1371

EP - 1375

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 12

ER -