Predictive data mining for diagnosing periodontal disease: the Korea National Health and Nutrition Examination Surveys (KNHANES V and VI) from 2010 to 2015

Jae Hong Lee, Seong Nyum Jeong, Seongho Choi

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

Objectives: This study aimed to identify patients with the highest risk of periodontal disease (PD), and to provide recommendations for the effective use and application of data mining (DM) techniques when establishing evidence-based dental-care policies for vulnerable groups at a high risk of PD.

Methods: This study used the SEMMA (Sample, Explore, Modify, Model, and Assess) methodology to construct DM models based on data acquired from the fifth and sixth Korea National Health and Nutrition Examination Surveys (2010-2015). We analyzed the sociodemographic and comorbidity variables that influence PD by applying the popular DM techniques of decision-tree, neural-network, and regression models, and also attempted to improve the predictive power and reliability by comparing the results obtained by these three models.

Results: Our comparisons of the three DM algorithms confirmed that the average squared error, misclassification rate, receiver operating characteristic index, Gini coefficient, and Kolmogorov–Smirnov test results were the most appropriate for the decision-tree model. The analysis of the decision-tree model revealed that age and smoking status exert major effects on the risk of PD, and that stress and education level exert effects in rural areas, whereas education level, sex, hyperlipidemia, and alcohol intake exert effects in urban areas.

Conclusions: We demonstrated that the decision-tree model is an effective DM technique for identifying the complex risk factors for PD. These results are expected to be helpful in improving the equality and efficacy of dental-care policies for vulnerable groups at a high risk of PD.
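
The five assessment statistics named in the Results can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `assessment_metrics`, the 0.5 classification threshold, and the toy labels and scores are assumptions made for illustration only. The Gini coefficient is computed here as 2 × ROC index − 1, a common convention that the abstract does not confirm.

```python
from typing import Dict, Sequence


def assessment_metrics(
    y_true: Sequence[int],
    p_hat: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Dict[str, float]:
    """Compute five model-assessment statistics for a binary classifier.

    y_true holds 0/1 labels; p_hat holds predicted probabilities of class 1.
    """
    n = len(y_true)
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # Average squared error between labels and predicted probabilities.
    ase = sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / n

    # Share of cases whose thresholded prediction disagrees with the label.
    misclassification = sum(
        (p >= threshold) != (y == 1) for y, p in zip(y_true, p_hat)
    ) / n

    # ROC index (AUC): probability that a random positive case scores
    # higher than a random negative case, with ties counted as one half.
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    roc_index = wins / (len(pos) * len(neg))

    # Gini coefficient under the assumed convention 2 * AUC - 1.
    gini = 2.0 * roc_index - 1.0

    # Kolmogorov-Smirnov statistic: largest gap between the empirical
    # score distributions of the two classes.
    ks = max(
        abs(
            sum(p <= t for p in pos) / len(pos)
            - sum(p <= t for p in neg) / len(neg)
        )
        for t in set(p_hat)
    )

    return {
        "average_squared_error": ase,
        "misclassification_rate": misclassification,
        "roc_index": roc_index,
        "gini": gini,
        "ks": ks,
    }


if __name__ == "__main__":
    # Made-up toy data, not drawn from KNHANES.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
    for name, value in assessment_metrics(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

Running the same function on the held-out predictions of each candidate model (decision tree, neural network, regression) would give one row of statistics per model. Lower error and misclassification, together with higher ROC index, Gini, and KS values, would then favor a model.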

Original language: English
Journal: Journal of Public Health Dentistry
DOI: 10.1111/jphd.12293
Publication status: Accepted/In press - 2018 Jan 1

Fingerprint

Data Mining
Nutrition Surveys
Periodontal Diseases
Korea
Decision Trees
Dental Care
Neural Networks (Computer)
Sex Education
Hyperlipidemias
Reproducibility of Results
ROC Curve
Comorbidity
Smoking
Alcohols
Education

All Science Journal Classification (ASJC) codes

  • Dentistry(all)
  • Public Health, Environmental and Occupational Health

Cite this

@article{43467da9a25241f6aa5004833f07e38a,
title = "Predictive data mining for diagnosing periodontal disease: the Korea National Health and Nutrition Examination Surveys (KNHANES V and VI) from 2010 to 2015",
abstract = "Objectives: This study aimed to identify patients with the highest risk of periodontal disease (PD), and to provide recommendations for the effective use and application of data mining (DM) techniques when establishing evidence-based dental-care policies for vulnerable groups at a high risk of PD. Methods: This study used the SEMMA (Sample, Explore, Modify, Model, and Assess) methodology to construct DM models based on data acquired from the fifth and sixth Korea National Health and Nutrition Examination Surveys (2010-2015). We analyzed the sociodemographic and comorbidity variables that influence PD by applying the popular DM techniques of decision-tree, neural-network, and regression models, and also attempted to improve the predictive power and reliability by comparing the results obtained by these three models. Results: Our comparisons of the three DM algorithms confirmed that the average squared error, misclassification rate, receiver operating characteristic index, Gini coefficient, and Kolmogorov–Smirnov test results were the most appropriate for the decision-tree model. The analysis of the decision-tree model revealed that age and smoking status exert major effects on the risk of PD, and that stress and education level exert effects in rural areas, whereas education level, sex, hyperlipidemia, and alcohol intake exert effects in urban areas. Conclusions: We demonstrated that the decision-tree model is an effective DM technique for identifying the complex risk factors for PD. These results are expected to be helpful in improving the equality and efficacy of dental-care policies for vulnerable groups at a high risk of PD.",
author = "Lee, {Jae Hong} and Jeong, {Seong Nyum} and Choi, Seongho",
year = "2018",
month = "1",
day = "1",
doi = "10.1111/jphd.12293",
language = "English",
journal = "Journal of Public Health Dentistry",
issn = "0022-4006",
publisher = "Wiley-Blackwell",

}

TY - JOUR

T1 - Predictive data mining for diagnosing periodontal disease

T2 - the Korea National Health and Nutrition Examination Surveys (KNHANES V and VI) from 2010 to 2015

AU - Lee, Jae Hong

AU - Jeong, Seong Nyum

AU - Choi, Seongho

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Objectives: This study aimed to identify patients with the highest risk of periodontal disease (PD), and to provide recommendations for the effective use and application of data mining (DM) techniques when establishing evidence-based dental-care policies for vulnerable groups at a high risk of PD. Methods: This study used the SEMMA (Sample, Explore, Modify, Model, and Assess) methodology to construct DM models based on data acquired from the fifth and sixth Korea National Health and Nutrition Examination Surveys (2010-2015). We analyzed the sociodemographic and comorbidity variables that influence PD by applying the popular DM techniques of decision-tree, neural-network, and regression models, and also attempted to improve the predictive power and reliability by comparing the results obtained by these three models. Results: Our comparisons of the three DM algorithms confirmed that the average squared error, misclassification rate, receiver operating characteristic index, Gini coefficient, and Kolmogorov–Smirnov test results were the most appropriate for the decision-tree model. The analysis of the decision-tree model revealed that age and smoking status exert major effects on the risk of PD, and that stress and education level exert effects in rural areas, whereas education level, sex, hyperlipidemia, and alcohol intake exert effects in urban areas. Conclusions: We demonstrated that the decision-tree model is an effective DM technique for identifying the complex risk factors for PD. These results are expected to be helpful in improving the equality and efficacy of dental-care policies for vulnerable groups at a high risk of PD.

AB - Objectives: This study aimed to identify patients with the highest risk of periodontal disease (PD), and to provide recommendations for the effective use and application of data mining (DM) techniques when establishing evidence-based dental-care policies for vulnerable groups at a high risk of PD. Methods: This study used the SEMMA (Sample, Explore, Modify, Model, and Assess) methodology to construct DM models based on data acquired from the fifth and sixth Korea National Health and Nutrition Examination Surveys (2010-2015). We analyzed the sociodemographic and comorbidity variables that influence PD by applying the popular DM techniques of decision-tree, neural-network, and regression models, and also attempted to improve the predictive power and reliability by comparing the results obtained by these three models. Results: Our comparisons of the three DM algorithms confirmed that the average squared error, misclassification rate, receiver operating characteristic index, Gini coefficient, and Kolmogorov–Smirnov test results were the most appropriate for the decision-tree model. The analysis of the decision-tree model revealed that age and smoking status exert major effects on the risk of PD, and that stress and education level exert effects in rural areas, whereas education level, sex, hyperlipidemia, and alcohol intake exert effects in urban areas. Conclusions: We demonstrated that the decision-tree model is an effective DM technique for identifying the complex risk factors for PD. These results are expected to be helpful in improving the equality and efficacy of dental-care policies for vulnerable groups at a high risk of PD.

UR - http://www.scopus.com/inward/record.url?scp=85057140056&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057140056&partnerID=8YFLogxK

U2 - 10.1111/jphd.12293

DO - 10.1111/jphd.12293

M3 - Article

C2 - 30468241

JO - Journal of Public Health Dentistry

JF - Journal of Public Health Dentistry

SN - 0022-4006

ER -