CGM: A biomedical text categorization approach using concept graph mining

Said Bleik, Min Song, Aaron Smalter, Jun Huan, Gerald Lushington

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

Text categorization is used to organize and manage biomedical text databases that are growing at an exponential rate. The feature representation of documents is a crucial factor in the performance of text categorization. Most of the successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which we represent a document as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared with categorization based only on the extracted concepts.
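The idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how concepts are extracted, which ontology supplies the additional concepts, or which graph kernel is used, so everything below is an assumption made for illustration: the toy `ONTOLOGY` table, the function names, the set-intersection kernel, and the 1-nearest-neighbor classifier are all hypothetical and are not the authors' method.

```python
# Hypothetical toy ontology: concept -> list of (relation, related concept).
# These entries are illustrative assumptions, not data from the paper.
ONTOLOGY = {
    "aspirin": [("is_a", "nsaid"), ("treats", "pain")],
    "ibuprofen": [("is_a", "nsaid"), ("treats", "pain")],
    "insulin": [("is_a", "hormone"), ("treats", "diabetes")],
}


def build_concept_graph(concepts):
    """Return (nodes, edges) for a document's extracted concepts.

    Nodes are the extracted concepts plus related concepts pulled in from
    the toy ontology; edges are (source, relation, target) triples.
    """
    nodes = set(concepts)
    edges = set()
    for concept in concepts:
        for relation, target in ONTOLOGY.get(concept, []):
            nodes.add(target)
            edges.add((concept, relation, target))
    return nodes, edges


def graph_kernel(graph_a, graph_b):
    """Count shared node labels plus shared labeled edges.

    A simple set-intersection kernel chosen for illustration; the kernel
    used in the paper is not stated in the abstract.
    """
    nodes_a, edges_a = graph_a
    nodes_b, edges_b = graph_b
    return len(nodes_a & nodes_b) + len(edges_a & edges_b)


def classify(doc_concepts, labeled_docs):
    """Assign the label of the most similar training graph (1-nearest neighbor)."""
    query = build_concept_graph(doc_concepts)
    best_label, _ = max(
        ((label, graph_kernel(query, build_concept_graph(concepts)))
         for concepts, label in labeled_docs),
        key=lambda pair: pair[1],
    )
    return best_label
```

In this toy setting, a document mentioning only "aspirin" and one mentioning only "ibuprofen" share no extracted concepts, yet their enriched graphs share the nodes "nsaid" and "pain", so the kernel value is 2 rather than 0. That is the kind of extra signal a graph representation can provide over the extracted concepts alone.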

Original language: English
Title of host publication: Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009
Pages: 38-43
Number of pages: 6
DOIs: https://doi.org/10.1109/BIBMW.2009.5332134
Publication status: Published - 2009 Dec 1
Event: 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009 - Washington, DC, United States
Duration: 2009 Nov 1 to 2009 Nov 4

Publication series

Name: Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009

Other

Other: 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009
Country: United States
City: Washington, DC
Period: 09/11/1 to 09/11/4

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Cite this

Bleik, S., Song, M., Smalter, A., Huan, J., & Lushington, G. (2009). CGM: A biomedical text categorization approach using concept graph mining. In Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009 (pp. 38-43). [5332134] (Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009). https://doi.org/10.1109/BIBMW.2009.5332134
@inproceedings{03618e6d0f6d4bda9d1c46c25bf3f9c2,
title = "CGM: A biomedical text categorization approach using concept graph mining",
author = "Said Bleik and Min Song and Aaron Smalter and Jun Huan and Gerald Lushington",
year = "2009",
month = "12",
day = "1",
doi = "10.1109/BIBMW.2009.5332134",
language = "English",
isbn = "9781424451210",
series = "Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009",
pages = "38--43",
booktitle = "Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009",

}
