Development and verification of prediction models for preventing cardiovascular diseases

Ji Min Sung, In Jeong Cho, David Sung, Sunhee Kim, Hyeon Chang Kim, Myeong Hun Chae, Maryam Kavousi, Oscar L. Rueda-Ochoa, M. Arfan Ikram, Oscar H. Franco, Hyuk Jae Chang

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Objectives: Cardiovascular disease (CVD) is one of the major causes of death worldwide. To improve the accuracy of CVD prediction, risk classification was performed using national time-series health examination data. These data offer an opportunity to apply deep learning (RNN-LSTM), which is widely regarded as well suited to analyzing time-series datasets. The objective of this study was to show the improved accuracy of deep learning by comparing the performance of a Cox hazard regression model and an RNN-LSTM model based on survival analysis.

Methods and findings: We selected 361,239 subjects (age 40 to 79 years) with more than two health examination records from 2002–2006 using the National Health Insurance System-National Health Screening Cohort (NHIS-HEALS). The average number of health screenings (from 2002–2013) used in the analysis was 2.9 ± 1.0. Two CVD prediction models were developed from the NHIS-HEALS data: a Cox hazard regression model and a deep learning model. In an internal validation on the NHIS-HEALS dataset, the Cox regression model showed its highest time-dependent area under the curve (AUC) of 0.79 (95% CI 0.70 to 0.87) in females and 0.75 (95% CI 0.70 to 0.80) in males at 2 years. The deep learning model showed its highest time-dependent AUC of 0.94 (95% CI 0.91 to 0.97) in females and 0.96 (95% CI 0.95 to 0.97) in males at 2 years. Layer-wise Relevance Propagation (LRP) revealed that age was the variable with the greatest effect on CVD, followed by systolic blood pressure (SBP) and diastolic blood pressure (DBP), in that order.

Conclusion: The deep learning model predicted CVD occurrences better than the Cox regression model. In addition, LRP applied to the study results recovered the known risk factors that previous clinical studies have shown to be important.

Original language: English
Article number: e0222809
Journal: PLOS ONE
Volume: 14
Issue number: 9
Publication status: Published - 2019 Sep 1

Bibliographical note

Funding Information:
This study was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00861, Intelligent SW Technology Development for Medical Data Analysis). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This study used NHIS-HEALS data (NHIS-2016-2-132) from the National Health Insurance Service (NHIS). The authors declare no conflicts of interest with NHIS.

Publisher Copyright:
© 2019 Sung et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)
  • General
