Suicidality Detection on Social Media Using Metadata and Text Feature Extraction and Machine Learning

Woojin Jung, Donghun Kim, Seojin Nam, Yongjun Zhu

Research output: Contribution to journal > Article > peer-review

Abstract

In this study, we implemented machine learning models that detect suicidality posts on Twitter. We randomly selected and annotated 20,000 tweets and explored metadata and text features to build effective models. Metadata features were studied in detail to understand their potential and importance in suicidality detection models. Results showed that posting type (i.e., reply or not) and time-related features such as the month, the day of the week, and the time of day (AM vs. PM) were the most important metadata features in suicidality detection models. Specifically, the probability of a social media post being suicidal is higher if the post is a reply to other users rather than an original tweet. Moreover, tweets created in the afternoon, on Fridays and weekends, and in fall have higher probabilities of being detected as suicidality tweets than those created at other times. By integrating metadata and text features, we obtained a model with good performance (i.e., an F1 score of 0.846) that can assist humans in real-world settings in detecting suicidality social media posts.
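To make the metadata features named in the abstract concrete, the following is a minimal sketch of how they might be derived from a single tweet record. It is not the authors' implementation: the function name `metadata_features`, its input fields, and the output keys are hypothetical, and the definition of "fall" as September through November is an assumption, since the abstract does not define the seasons.

```python
from datetime import datetime
from typing import Optional

# Day names indexed by datetime.weekday() (Monday is 0).
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def metadata_features(created_at: str, in_reply_to: Optional[str]) -> dict:
    """Derive posting-type and time-related features for one tweet.

    created_at  -- ISO 8601 timestamp string (hypothetical input format)
    in_reply_to -- ID of the tweet being replied to, or None for an
                   original tweet (hypothetical field)
    """
    ts = datetime.fromisoformat(created_at)
    return {
        "is_reply": in_reply_to is not None,    # posting type: reply or not
        "month": ts.month,                      # 1 to 12
        "day_of_week": DAY_NAMES[ts.weekday()],
        "is_pm": ts.hour >= 12,                 # AM vs. PM
        "is_weekend": ts.weekday() >= 5,        # Saturday or Sunday
        "is_fall": ts.month in (9, 10, 11),     # assumption: Sep to Nov
    }


if __name__ == "__main__":
    print(metadata_features("2021-10-15T15:30:00", "12345"))
```

Features of this kind could then be concatenated with text features before training a classifier; the abstract does not specify how the two feature sets were combined.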

Original language: English
Journal: Archives of Suicide Research
Publication status: Accepted/In press - 2021

Bibliographical note

Publisher Copyright:
© 2021 International Academy for Suicide Research.

All Science Journal Classification (ASJC) codes

  • Clinical Psychology
  • Psychiatry and Mental health
