Shortest Path Edit Distance for detecting duplicate biological entities

Alex Rudniy, Min Song, James Geller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a novel, context-sensitive Shortest Path Edit Distance (SPED) applied to duplicate entity detection in biological data. SPED is an extension of Markov Random Field-based Edit Distance. It transforms the edit distance computation into the calculation of the shortest path between two selected vertices of a graph. The experimental results show that SPED produces competitive outcomes. Soft-SPED, the combination of SPED with TFIDF, achieves superior performance in most cases.
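The idea of recasting edit distance as a shortest-path problem can be illustrated with a minimal sketch. It is not the authors' SPED: it uses the classic edit graph with fixed, context-free costs, whereas SPED is described as context-sensitive and derived from a Markov Random Field-based model. The function name `edit_distance_shortest_path` and the parameters `ins_cost`, `del_cost`, and `sub_cost` are assumptions made for illustration only.

```python
import heapq


def edit_distance_shortest_path(source, target,
                                ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Edit distance computed as a shortest path over the edit graph.

    Vertex (i, j) means: the first i characters of `source` have been
    turned into the first j characters of `target`. The distance is the
    cheapest path from (0, 0) to (len(source), len(target)).
    Costs are fixed here (an assumption); they are not context-sensitive.
    """
    n, m = len(source), len(target)
    start, goal = (0, 0), (n, m)
    best = {start: 0.0}          # cheapest known cost to reach each vertex
    queue = [(0.0, start)]       # Dijkstra priority queue of (cost, vertex)

    while queue:
        dist, (i, j) = heapq.heappop(queue)
        if (i, j) == goal:
            return dist
        if dist > best.get((i, j), float("inf")):
            continue             # stale queue entry, already improved

        edges = []
        if i < n:                # delete source[i]
            edges.append(((i + 1, j), del_cost))
        if j < m:                # insert target[j]
            edges.append(((i, j + 1), ins_cost))
        if i < n and j < m:      # match (free) or substitute
            cost = 0.0 if source[i] == target[j] else sub_cost
            edges.append(((i + 1, j + 1), cost))

        for vertex, cost in edges:
            candidate = dist + cost
            if candidate < best.get(vertex, float("inf")):
                best[vertex] = candidate
                heapq.heappush(queue, (candidate, vertex))

    raise AssertionError("the goal vertex is always reachable")


if __name__ == "__main__":
    print(edit_distance_shortest_path("kitten", "sitting"))
```

With unit costs this reproduces the Levenshtein distance. A context-sensitive variant would, under this framing, replace the constant edge weights with weights that depend on neighbouring characters.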

Original language: English
Title of host publication: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
Pages: 442-444
Number of pages: 3
DOIs: https://doi.org/10.1145/1854776.1854851
Publication status: Published - 2010 Oct 25
Event: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010 - Niagara Falls, NY, United States
Duration: 2010 Aug 2 to 2010 Aug 4

Other

Other: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
Country: United States
City: Niagara Falls, NY
Period: 10/8/2 to 10/8/4

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Information Management

Cite this

Rudniy, A., Song, M., & Geller, J. (2010). Shortest Path Edit Distance for detecting duplicate biological entities. In 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010 (pp. 442-444). (2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010). https://doi.org/10.1145/1854776.1854851