Ten-year prediction of suicide death using Cox regression and machine learning in a nationwide retrospective cohort study in South Korea

Soo Beom Choi, Wanhyung Lee, JinHa Yoon, Jong Uk Won, Deok Won Kim

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

Background: Death by suicide is a preventable public health concern worldwide. The aim of this study is to investigate the probability of suicide death from baseline characteristics and simple medical facility visit history data using Cox regression, support vector machines (SVMs), and deep neural networks (DNNs). Method: This study included 819,951 subjects in the National Health Insurance Service (NHIS)–Cohort Sample Database from 2004 to 2013. The dataset was divided randomly into two independent training and validation groups. To improve the performance of predicting suicide death, we applied SVM and DNN to the same training set as the Cox regression model. Results: Among the study population, 2546 people died by intentional self-harm during the follow-up time. Sex, age, type of insurance, household income, disability, and medical records of eight ICD-10 codes (including mental and behavioural disorders) were selected by a Cox regression model with backward stepwise elimination. The area under the curve (AUC) values of Cox regression (0.688), SVM (0.687), and DNN (0.683) were approximately the same. The group with the top 0.5% of predicted probability had a hazard ratio of 26.21 compared to that with the lowest 10% of predicted probability. Limitations: This study is limited by the lack of information on suicidal ideation and attempts and on other potential covariates such as medication information and subcategory ICD-10 codes. Moreover, predictors from the 12–24 months before the date of death could be expected to perform better than predictors from up to 10 years ago. Conclusions: We suggest a 10-year probability prediction model for suicide death using general characteristics and simple insurance data, which are annually conducted by the Korean government. Suicide death prevention might be enhanced by our prediction model.

Original language: English
Pages (from-to): 8-14
Number of pages: 7
Journal: Journal of Affective Disorders
Volume: 231
DOI: 10.1016/j.jad.2018.01.019
Publication status: Published - 2018 Apr 15

Fingerprint

Republic of Korea
Suicide
Cohort Studies
Retrospective Studies
International Classification of Diseases
National Health Programs
Insurance
Proportional Hazards Models
Suicidal Ideation
Mental Disorders
Area Under Curve
Medical Records
Machine Learning
Public Health
History
Databases
Population
Support Vector Machine

All Science Journal Classification (ASJC) codes

  • Clinical Psychology
  • Psychiatry and Mental health

Cite this

@article{60407184794b4329a2a83ef33f13d91b,
title = "Ten-year prediction of suicide death using Cox regression and machine learning in a nationwide retrospective cohort study in South Korea",
abstract = "Background: Death by suicide is a preventable public health concern worldwide. The aim of this study is to investigate the probability of suicide death from baseline characteristics and simple medical facility visit history data using Cox regression, support vector machines (SVMs), and deep neural networks (DNNs). Method: This study included 819,951 subjects in the National Health Insurance Service (NHIS)–Cohort Sample Database from 2004 to 2013. The dataset was divided randomly into two independent training and validation groups. To improve the performance of predicting suicide death, we applied SVM and DNN to the same training set as the Cox regression model. Results: Among the study population, 2546 people died by intentional self-harm during the follow-up time. Sex, age, type of insurance, household income, disability, and medical records of eight ICD-10 codes (including mental and behavioural disorders) were selected by a Cox regression model with backward stepwise elimination. The area under the curve (AUC) values of Cox regression (0.688), SVM (0.687), and DNN (0.683) were approximately the same. The group with the top 0.5{\%} of predicted probability had a hazard ratio of 26.21 compared to that with the lowest 10{\%} of predicted probability. Limitations: This study is limited by the lack of information on suicidal ideation and attempts and on other potential covariates such as medication information and subcategory ICD-10 codes. Moreover, predictors from the 12–24 months before the date of death could be expected to perform better than predictors from up to 10 years ago. Conclusions: We suggest a 10-year probability prediction model for suicide death using general characteristics and simple insurance data, which are annually conducted by the Korean government. Suicide death prevention might be enhanced by our prediction model.",
author = "Choi, {Soo Beom} and Lee, Wanhyung and Yoon, JinHa and Won, {Jong Uk} and Kim, {Deok Won}",
year = "2018",
month = "4",
day = "15",
doi = "10.1016/j.jad.2018.01.019",
language = "English",
volume = "231",
pages = "8--14",
journal = "Journal of Affective Disorders",
issn = "0165-0327",
publisher = "Elsevier",

}

Ten-year prediction of suicide death using Cox regression and machine learning in a nationwide retrospective cohort study in South Korea. / Choi, Soo Beom; Lee, Wanhyung; Yoon, JinHa; Won, Jong Uk; Kim, Deok Won.

In: Journal of Affective Disorders, Vol. 231, 15.04.2018, p. 8-14.

TY - JOUR

T1 - Ten-year prediction of suicide death using Cox regression and machine learning in a nationwide retrospective cohort study in South Korea

AU - Choi, Soo Beom

AU - Lee, Wanhyung

AU - Yoon, JinHa

AU - Won, Jong Uk

AU - Kim, Deok Won

PY - 2018/4/15

Y1 - 2018/4/15

AB - Background: Death by suicide is a preventable public health concern worldwide. The aim of this study is to investigate the probability of suicide death from baseline characteristics and simple medical facility visit history data using Cox regression, support vector machines (SVMs), and deep neural networks (DNNs). Method: This study included 819,951 subjects in the National Health Insurance Service (NHIS)–Cohort Sample Database from 2004 to 2013. The dataset was divided randomly into two independent training and validation groups. To improve the performance of predicting suicide death, we applied SVM and DNN to the same training set as the Cox regression model. Results: Among the study population, 2546 people died by intentional self-harm during the follow-up time. Sex, age, type of insurance, household income, disability, and medical records of eight ICD-10 codes (including mental and behavioural disorders) were selected by a Cox regression model with backward stepwise elimination. The area under the curve (AUC) values of Cox regression (0.688), SVM (0.687), and DNN (0.683) were approximately the same. The group with the top 0.5% of predicted probability had a hazard ratio of 26.21 compared to that with the lowest 10% of predicted probability. Limitations: This study is limited by the lack of information on suicidal ideation and attempts and on other potential covariates such as medication information and subcategory ICD-10 codes. Moreover, predictors from the 12–24 months before the date of death could be expected to perform better than predictors from up to 10 years ago. Conclusions: We suggest a 10-year probability prediction model for suicide death using general characteristics and simple insurance data, which are annually conducted by the Korean government. Suicide death prevention might be enhanced by our prediction model.

UR - http://www.scopus.com/inward/record.url?scp=85041482472&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041482472&partnerID=8YFLogxK

U2 - 10.1016/j.jad.2018.01.019

DO - 10.1016/j.jad.2018.01.019

M3 - Article

VL - 231

SP - 8

EP - 14

JO - Journal of Affective Disorders

JF - Journal of Affective Disorders

SN - 0165-0327

ER -