Protein-protein interaction (PPI) extraction has been a focal point of many biomedical research and database curation tools. Both active learning and semi-supervised SVMs have recently been applied to extract PPIs automatically. In this paper, we explore integrating active learning with semi-supervised SVMs, together with a Natural Language Processing (NLP)-driven feature selection technique. Our contributions are as follows: (a) We propose a novel PPI extraction technique, PPISpotter, which combines an active learning technique with semi-supervised SVMs to extract protein-protein interactions. (b) We extract a comprehensive set of features from MEDLINE records using NLP techniques for use by SVM classifiers. (c) We conduct experiments on three different PPI corpora and show that PPISpotter is superior to four other comparison techniques in terms of precision, recall, and F-measure.
|Title of host publication||Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining|
|Publisher||Association for Computing Machinery|
|Number of pages||10|
|Publication status||Published - 2010|
|Event||9th International Workshop on Data Mining in Bioinformatics, BIOKDD 2010, Held in Conjunction with 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining - Washington, United States|
Duration: 2010 Jul 25 → 2010 Jul 28
Bibliographical note
Publisher Copyright:
Copyright © 2010 ACM.
All Science Journal Classification (ASJC) codes
- Information Systems