An application of convolutional neural networks with salient features for relation classification

Zolzaya Dashdorj, Min Song

Research output: Contribution to journalArticle

Abstract

Background: Due to the advent of deep learning, the increasing number of studies in the biomedical domain has attracted much interest in feature extraction and classification tasks. In this research, we seek the best combination of feature set and hyperparameter setting of deep learning algorithms for relation classification. To this end, we incorporate an entity and relation extraction tool, PKDE4J to extract biomedical features (i.e., biomedical entities, relations) for the relation classification. We compared the chosen Convolutional Neural Networks (CNN) based classification model with the most widely used learning algorithms. Results: Our CNN based classification model outperforms the most widely used supervised algorithms. We achieved a significant performance on binary classification with a weighted macro-average F1-score: 94.79% using pre-extracted relevant feature combinations. For multi-class classification, the weighted macro-average F1-score is estimated around 86.95%. Conclusions: Our results suggest that our proposed CNN based model using the not only single feature as the raw text of the sentences of biomedical literature, but also coupling with multiple and highlighted features extracted from the biomedical sentences could improve the classification performance significantly. We offer hyperparameter tuning and optimization approaches for our proposed model to obtain optimal hyperparameters of the models with the best performance.

Original languageEnglish
Article number244
JournalBMC bioinformatics
Volume20
DOIs
Publication statusPublished - 2019 May 29

Fingerprint

Neural Networks
Neural networks
Hyperparameters
Learning Algorithm
Learning
Learning algorithms
Macros
Multi-class Classification
Binary Classification
Model
Feature Extraction
Neural Networks (Computer)
Tuning
Feature extraction
Optimization
Research
Deep learning

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

@article{de4e826c545a45df811778a3a99aeafe,
title = "An application of convolutional neural networks with salient features for relation classification",
abstract = "Background: Due to the advent of deep learning, the increasing number of studies in the biomedical domain has attracted much interest in feature extraction and classification tasks. In this research, we seek the best combination of feature set and hyperparameter setting of deep learning algorithms for relation classification. To this end, we incorporate an entity and relation extraction tool, PKDE4J to extract biomedical features (i.e., biomedical entities, relations) for the relation classification. We compared the chosen Convolutional Neural Networks (CNN) based classification model with the most widely used learning algorithms. Results: Our CNN based classification model outperforms the most widely used supervised algorithms. We achieved a significant performance on binary classification with a weighted macro-average F1-score: 94.79{\%} using pre-extracted relevant feature combinations. For multi-class classification, the weighted macro-average F1-score is estimated around 86.95{\%}. Conclusions: Our results suggest that our proposed CNN based model using the not only single feature as the raw text of the sentences of biomedical literature, but also coupling with multiple and highlighted features extracted from the biomedical sentences could improve the classification performance significantly. We offer hyperparameter tuning and optimization approaches for our proposed model to obtain optimal hyperparameters of the models with the best performance.",
author = "Zolzaya Dashdorj and Min Song",
year = "2019",
month = "5",
day = "29",
doi = "10.1186/s12859-019-2808-3",
language = "English",
volume = "20",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

An application of convolutional neural networks with salient features for relation classification. / Dashdorj, Zolzaya; Song, Min.

In: BMC bioinformatics, Vol. 20, 244, 29.05.2019.

Research output: Contribution to journalArticle

TY - JOUR

T1 - An application of convolutional neural networks with salient features for relation classification

AU - Dashdorj, Zolzaya

AU - Song, Min

PY - 2019/5/29

Y1 - 2019/5/29

N2 - Background: Due to the advent of deep learning, the increasing number of studies in the biomedical domain has attracted much interest in feature extraction and classification tasks. In this research, we seek the best combination of feature set and hyperparameter setting of deep learning algorithms for relation classification. To this end, we incorporate an entity and relation extraction tool, PKDE4J to extract biomedical features (i.e., biomedical entities, relations) for the relation classification. We compared the chosen Convolutional Neural Networks (CNN) based classification model with the most widely used learning algorithms. Results: Our CNN based classification model outperforms the most widely used supervised algorithms. We achieved a significant performance on binary classification with a weighted macro-average F1-score: 94.79% using pre-extracted relevant feature combinations. For multi-class classification, the weighted macro-average F1-score is estimated around 86.95%. Conclusions: Our results suggest that our proposed CNN based model using the not only single feature as the raw text of the sentences of biomedical literature, but also coupling with multiple and highlighted features extracted from the biomedical sentences could improve the classification performance significantly. We offer hyperparameter tuning and optimization approaches for our proposed model to obtain optimal hyperparameters of the models with the best performance.

AB - Background: Due to the advent of deep learning, the increasing number of studies in the biomedical domain has attracted much interest in feature extraction and classification tasks. In this research, we seek the best combination of feature set and hyperparameter setting of deep learning algorithms for relation classification. To this end, we incorporate an entity and relation extraction tool, PKDE4J to extract biomedical features (i.e., biomedical entities, relations) for the relation classification. We compared the chosen Convolutional Neural Networks (CNN) based classification model with the most widely used learning algorithms. Results: Our CNN based classification model outperforms the most widely used supervised algorithms. We achieved a significant performance on binary classification with a weighted macro-average F1-score: 94.79% using pre-extracted relevant feature combinations. For multi-class classification, the weighted macro-average F1-score is estimated around 86.95%. Conclusions: Our results suggest that our proposed CNN based model using the not only single feature as the raw text of the sentences of biomedical literature, but also coupling with multiple and highlighted features extracted from the biomedical sentences could improve the classification performance significantly. We offer hyperparameter tuning and optimization approaches for our proposed model to obtain optimal hyperparameters of the models with the best performance.

UR - http://www.scopus.com/inward/record.url?scp=85066327007&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066327007&partnerID=8YFLogxK

U2 - 10.1186/s12859-019-2808-3

DO - 10.1186/s12859-019-2808-3

M3 - Article

C2 - 31138159

AN - SCOPUS:85066327007

VL - 20

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 244

ER -