Combining active learning and semi-supervised learning techniques to extract protein interaction sentences

Min Song, Hwanjo Yu, Wook Shin Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Protein-protein interaction (PPI) extraction has been a focal point of much biomedical research and of many database curation tools. Both active learning and semi-supervised SVMs have recently been applied to extract PPIs automatically. In this paper, we explore integrating active learning approaches into semi-supervised SVMs with an NLP-driven feature selection technique. Our contributions in this paper are as follows: (a) We proposed a novel PPI extraction technique called PPISpotter that combines an active learning technique with semi-supervised SVMs to extract protein-protein interactions. (b) We extracted a comprehensive set of features from MEDLINE records using Natural Language Processing (NLP) techniques for SVM classifiers. (c) We conducted experiments with three different PPI corpora and showed that PPISpotter is superior to four other comparison techniques in terms of precision, recall, and F-measure.
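
The abstract does not describe how PPISpotter couples the two learning strategies, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a margin-based classifier has already produced a signed score for each unlabeled sentence (positive meaning "describes an interaction"). Low-margin sentences go to a human annotator (the active learning step), and high-margin sentences are pseudo-labeled with the classifier's own prediction. That second step is plain self-training, used here as a simple stand-in for a semi-supervised SVM. The function name, parameters, sentence ids, and scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def split_unlabeled_pool(
    scores: Dict[str, float],
    query_budget: int,
    confidence_margin: float,
) -> Tuple[List[str], Dict[str, int]]:
    """Split an unlabeled pool using signed classifier scores.

    `scores` maps a sentence id to a signed decision value from some
    margin-based classifier (hypothetical input; > 0 means "interaction").
    """
    # Active learning step: the sentences closest to the decision
    # boundary are the ones a human annotator is asked to label.
    by_uncertainty = sorted(scores, key=lambda sid: abs(scores[sid]))
    to_annotate = by_uncertainty[:query_budget]

    # Semi-supervised step (self-training stand-in): remaining sentences
    # far enough from the boundary keep the classifier's own prediction.
    pseudo_labels = {
        sid: (1 if scores[sid] > 0 else -1)
        for sid in by_uncertainty[query_budget:]
        if abs(scores[sid]) >= confidence_margin
    }
    return to_annotate, pseudo_labels


# Made-up scores for five unlabeled sentences (illustrative only).
scores = {"s1": 0.05, "s2": -1.80, "s3": 2.10, "s4": -0.20, "s5": 0.60}
to_annotate, pseudo_labels = split_unlabeled_pool(
    scores, query_budget=2, confidence_margin=1.0
)
```

In a full loop, the annotated and pseudo-labeled sentences would be added to the training set and the classifier retrained before the pool is scored again. Sentences that are neither queried nor confident (here "s5") simply stay in the pool.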

Original language: English
Title of host publication: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 35-44
Number of pages: 10
ISBN (Electronic): 9781605583020
Publication status: Published - 2010
Event: 9th International Workshop on Data Mining in Bioinformatics, BIOKDD 2010, Held in Conjunction with 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining - Washington, United States
Duration: 2010 Jul 25 to 2010 Jul 28

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other: 9th International Workshop on Data Mining in Bioinformatics, BIOKDD 2010, Held in Conjunction with 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Country/Territory: United States
City: Washington
Period: 10/7/25 to 10/7/28

Bibliographical note

Publisher Copyright:
Copyright © 2010 ACM.

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
