Development and validation of a deep learning based diabetes prediction system using a nationwide population-based cohort

Sang Youl Rhee, Ji Min Sung, Sunhee Kim, In Jeong Cho, Sang Eun Lee, Hyuk Jae Chang

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)


Background: Previously developed prediction models for type 2 diabetes mellitus (T2DM) have limited performance. We developed a deep learning (DL)-based model using a cohort representative of the Korean population. Methods: This study was based on the National Health Insurance Service-Health Screening (NHIS-HEALS) cohort of Korea. Overall, 335,302 subjects without T2DM at baseline were included. We developed the model using 80% of the subjects and validated its performance in the remaining 20%. Predictive models for T2DM were constructed using a recurrent neural network with long short-term memory (RNN-LSTM) and the Cox longitudinal summary model. The performance of the two models over a 10-year period was compared using the time-dependent area under the curve (AUC). Results: During a mean follow-up of 10.4±1.7 years, subjects underwent a mean of 2.9±1.0 periodic health check-ups. During the observation period, T2DM newly developed in 8.7% of the subjects. In each year of follow-up, the RNN-LSTM model outperformed the Cox model. The risk factors for T2DM identified by the two models were largely similar, although some differed. Conclusion: The DL-based T2DM prediction model, constructed using a cohort representative of the population, performs better than the conventional model. After pilot testing, this model will be provided to all recipients of Korean national health screening.
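
The time-dependent AUC used for the model comparison can be illustrated with a short sketch. The abstract does not state which estimator the authors used, so the following is only an assumed, simplified cumulative/dynamic version: it ignores censoring weights, and the function name `naive_time_dependent_auc` and its parameters are hypothetical. At a given horizon, subjects with an observed event by that time are treated as cases, subjects still event-free afterwards as controls, and the AUC is the share of case-control pairs in which the case has the higher predicted risk.

```python
from typing import Sequence


def naive_time_dependent_auc(
    times: Sequence[float],
    events: Sequence[bool],
    scores: Sequence[float],
    horizon: float,
) -> float:
    """Cumulative/dynamic AUC at `horizon`, without censoring weights.

    Hypothetical helper for illustration only, not the authors' method.
    times   -- follow-up time per subject (event or censoring time)
    events  -- True if the event (e.g. new T2DM) was observed at `times`
    scores  -- predicted risk per subject (higher means higher risk)
    """
    # Cases: event observed at or before the horizon.
    cases = [s for t, e, s in zip(times, events, scores) if e and t <= horizon]
    # Controls: still under observation and event-free after the horizon.
    controls = [s for t, s in zip(times, scores) if t > horizon]
    # Subjects censored before the horizon fall in neither group
    # (a simplification; proper estimators reweight for censoring).
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")

    # Count concordant case-control pairs; ties count as half.
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))
```

Evaluating such a function at horizons of 1 to 10 years for each model's risk scores would give a year-by-year comparison of the kind the abstract describes. An analysis on real cohort data would normally use an estimator that accounts for censoring.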

Original language: English
Pages (from-to): 515-525
Number of pages: 11
Journal: Diabetes and Metabolism Journal
Issue number: 4
Publication status: Published - 2021 Jul

Bibliographical note

Publisher Copyright:
© 2021 Korean Diabetes Association. All rights reserved.

All Science Journal Classification (ASJC) codes

  • Endocrinology, Diabetes and Metabolism

