TY - GEN
T1 - Document clustering by semantic smoothing and Dynamic Growing Cell Structure (DynGCS) for biomedical literature
AU - Song, Min
AU - Hu, Xiaohua
AU - Yoo, Illhoi
AU - Koppel, Eric
PY - 2008
Y1 - 2008
N2 - The general goal of clustering is to group data elements such that intra-group similarities are high and inter-group similarities are low. In this paper, we propose a novel hybrid clustering technique that incorporates semantic smoothing of document models into a neural network framework. The semantic smoothing model has recently been reported to enhance retrieval quality in Information Retrieval (IR). Inspired by this, we apply the context-sensitive semantic smoothing model to boost the accuracy of the clustering produced by a dynamic growing cell structure algorithm, a variant of the neural network technique. We evaluated the proposed technique on article sets from MEDLINE, the largest digital library in biomedicine. Our experimental evaluations show that the proposed algorithm significantly improves clustering quality over traditional clustering techniques.
AB - The general goal of clustering is to group data elements such that intra-group similarities are high and inter-group similarities are low. In this paper, we propose a novel hybrid clustering technique that incorporates semantic smoothing of document models into a neural network framework. The semantic smoothing model has recently been reported to enhance retrieval quality in Information Retrieval (IR). Inspired by this, we apply the context-sensitive semantic smoothing model to boost the accuracy of the clustering produced by a dynamic growing cell structure algorithm, a variant of the neural network technique. We evaluated the proposed technique on article sets from MEDLINE, the largest digital library in biomedicine. Our experimental evaluations show that the proposed algorithm significantly improves clustering quality over traditional clustering techniques.
UR - http://www.scopus.com/inward/record.url?scp=52949098384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=52949098384&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-85836-2_21
DO - 10.1007/978-3-540-85836-2_21
M3 - Conference contribution
AN - SCOPUS:52949098384
SN - 3540858350
SN - 9783540858355
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 217
EP - 226
BT - Data Warehousing and Knowledge Discovery - 10th International Conference, DaWaK 2008, Proceedings
T2 - 10th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2008
Y2 - 2 September 2008 through 5 September 2008
ER -