Decision-Tree-based data mining and rule induction for predicting and mapping soil bacterial diversity

Kangsuk Kim, Keunje Yoo, Dongwon Ki, Il Suh Son, Kyong Joo Oh, Joonhong Park

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

Soil microbial ecology plays a significant role in global ecosystems. Nevertheless, methods of model prediction and mapping have yet to be established for soil microbial ecology. The present study was undertaken to develop an artificial-intelligence- and geographical information system (GIS)-integrated framework for predicting and mapping soil bacterial diversity using pre-existing environmental geospatial database information, and to further evaluate the applicability of soil bacterial diversity mapping for planning construction of eco-friendly roads. Using stratified random sampling, soil bacterial diversity was measured in 196 soil samples in a forest area where construction of an eco-friendly road was planned. Model accuracy, coherence analyses, and tree analysis were systematically performed, and a four-class discretized decision tree (DT) with ordinary pair-wise partitioning (OPP) was selected as the optimal model among the five DT model variants tested. GIS-based simulations of the optimal DT model with varying weights assigned to soil ecological quality showed that the inclusion of soil ecology in environmental components, which are considered in environmental impact assessment, significantly affects the spatial distributions of overall environmental quality values as well as the determination of an environmentally optimized road route. This work suggests a guideline to use systematic accuracy, coherence, and tree analyses in selecting an optimal DT model from multiple candidate model variants, and demonstrates the applicability of the OPP-improved DT integrated with GIS in rule induction for mapping bacterial diversity. These findings also have implications for the significance of soil microbial ecology in environmental impact assessment and eco-friendly construction planning.
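The abstract does not give the binning rule or the weighting formula, so the following is only a minimal sketch of the two ideas it names: discretizing a diversity measure into four classes, and re-scoring candidate road routes as the weight on soil ecological quality varies. The quartile binning, the linear weighted sum, the function names, and the example scores are all assumptions made for illustration; none of them is taken from the paper, and the OPP procedure itself is not reproduced here.

```python
from statistics import quantiles


def discretize_four_classes(values):
    """Label each value 1-4 using sample quartiles (assumed binning rule)."""
    q1, q2, q3 = quantiles(values, n=4)

    def label(v):
        if v <= q1:
            return 1
        if v <= q2:
            return 2
        if v <= q3:
            return 3
        return 4

    return [label(v) for v in values]


def overall_quality(soil_score, other_score, soil_weight):
    """Linear blend of soil and non-soil quality scores (assumed formula)."""
    if not 0.0 <= soil_weight <= 1.0:
        raise ValueError("soil_weight must lie in [0, 1]")
    return soil_weight * soil_score + (1.0 - soil_weight) * other_score


def best_route(routes, soil_weight):
    """Return the route name with the highest blended quality score.

    `routes` maps a hypothetical route name to (soil_score, other_score),
    where higher scores mean better environmental quality.
    """
    return max(
        routes,
        key=lambda name: overall_quality(*routes[name], soil_weight),
    )


# Hypothetical scores, not data from the study.
example_routes = {"A": (1.0, 4.0), "B": (4.0, 2.0)}
classes = discretize_four_classes([1, 2, 3, 4, 5, 6, 7, 8])
route_without_soil = best_route(example_routes, soil_weight=0.0)
route_with_soil = best_route(example_routes, soil_weight=0.5)
```

With these made-up scores, the preferred route switches once soil quality receives a nonzero weight, which mirrors in miniature the abstract's statement that including soil ecology affects the choice of route.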

Original language: English
Pages (from-to): 595-610
Number of pages: 16
Journal: Environmental Monitoring and Assessment
Volume: 178
Issue number: 1-4
DOIs: 10.1007/s10661-010-1763-2
Publication status: Published - 2011 Jul 1

Fingerprint

data mining
Decision trees
Data mining
Soils
Ecology
soil
Information systems
microbial ecology
Environmental impact assessments
GIS
environmental impact assessment
road
partitioning
ecology
Planning
decision
artificial intelligence
environmental quality
Ecosystems
Spatial distribution

All Science Journal Classification (ASJC) codes

  • Environmental Science(all)
  • Pollution
  • Management, Monitoring, Policy and Law

Cite this

@article{f481f4aaa14c459892e66481e555ed13,
title = "Decision-Tree-based data mining and rule induction for predicting and mapping soil bacterial diversity",
abstract = "Soil microbial ecology plays a significant role in global ecosystems. Nevertheless, methods of model prediction and mapping have yet to be established for soil microbial ecology. The present study was undertaken to develop an artificial-intelligence- and geographical information system (GIS)-integrated framework for predicting and mapping soil bacterial diversity using pre-existing environmental geospatial database information, and to further evaluate the applicability of soil bacterial diversity mapping for planning construction of eco-friendly roads. Using stratified random sampling, soil bacterial diversity was measured in 196 soil samples in a forest area where construction of an eco-friendly road was planned. Model accuracy, coherence analyses, and tree analysis were systematically performed, and a four-class discretized decision tree (DT) with ordinary pair-wise partitioning (OPP) was selected as the optimal model among the five DT model variants tested. GIS-based simulations of the optimal DT model with varying weights assigned to soil ecological quality showed that the inclusion of soil ecology in environmental components, which are considered in environmental impact assessment, significantly affects the spatial distributions of overall environmental quality values as well as the determination of an environmentally optimized road route. This work suggests a guideline to use systematic accuracy, coherence, and tree analyses in selecting an optimal DT model from multiple candidate model variants, and demonstrates the applicability of the OPP-improved DT integrated with GIS in rule induction for mapping bacterial diversity. These findings also have implications for the significance of soil microbial ecology in environmental impact assessment and eco-friendly construction planning.",
author = "Kangsuk Kim and Keunje Yoo and Dongwon Ki and Son, {Il Suh} and Oh, {Kyong Joo} and Joonhong Park",
year = "2011",
month = "7",
day = "1",
doi = "10.1007/s10661-010-1763-2",
language = "English",
volume = "178",
pages = "595--610",
journal = "Environmental Monitoring and Assessment",
issn = "0167-6369",
publisher = "Springer Netherlands",
number = "1-4",

}

Decision-Tree-based data mining and rule induction for predicting and mapping soil bacterial diversity. / Kim, Kangsuk; Yoo, Keunje; Ki, Dongwon; Son, Il Suh; Oh, Kyong Joo; Park, Joonhong.

In: Environmental Monitoring and Assessment, Vol. 178, No. 1-4, 01.07.2011, p. 595-610.

TY - JOUR

T1 - Decision-Tree-based data mining and rule induction for predicting and mapping soil bacterial diversity

AU - Kim, Kangsuk

AU - Yoo, Keunje

AU - Ki, Dongwon

AU - Son, Il Suh

AU - Oh, Kyong Joo

AU - Park, Joonhong

PY - 2011/7/1

Y1 - 2011/7/1

N2 - Soil microbial ecology plays a significant role in global ecosystems. Nevertheless, methods of model prediction and mapping have yet to be established for soil microbial ecology. The present study was undertaken to develop an artificial-intelligence- and geographical information system (GIS)-integrated framework for predicting and mapping soil bacterial diversity using pre-existing environmental geospatial database information, and to further evaluate the applicability of soil bacterial diversity mapping for planning construction of eco-friendly roads. Using stratified random sampling, soil bacterial diversity was measured in 196 soil samples in a forest area where construction of an eco-friendly road was planned. Model accuracy, coherence analyses, and tree analysis were systematically performed, and a four-class discretized decision tree (DT) with ordinary pair-wise partitioning (OPP) was selected as the optimal model among the five DT model variants tested. GIS-based simulations of the optimal DT model with varying weights assigned to soil ecological quality showed that the inclusion of soil ecology in environmental components, which are considered in environmental impact assessment, significantly affects the spatial distributions of overall environmental quality values as well as the determination of an environmentally optimized road route. This work suggests a guideline to use systematic accuracy, coherence, and tree analyses in selecting an optimal DT model from multiple candidate model variants, and demonstrates the applicability of the OPP-improved DT integrated with GIS in rule induction for mapping bacterial diversity. These findings also have implications for the significance of soil microbial ecology in environmental impact assessment and eco-friendly construction planning.

UR - http://www.scopus.com/inward/record.url?scp=79960434254&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79960434254&partnerID=8YFLogxK

U2 - 10.1007/s10661-010-1763-2

DO - 10.1007/s10661-010-1763-2

M3 - Article

VL - 178

SP - 595

EP - 610

JO - Environmental Monitoring and Assessment

JF - Environmental Monitoring and Assessment

SN - 0167-6369

IS - 1-4

ER -