Overcoming asymmetry in entity graphs

Taesung Lee, Young Rok Cha, Seung Won Hwang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

This paper studies the problem of mining named entity translations by aligning comparable corpora. Current state-of-the-art approaches mine a translation pair by aligning an entity graph in one language to one in another, based on node similarity or on the propagated similarity of related entities. However, these approaches build on an assumption of 'symmetry' and quickly deteriorate on 'weakly' comparable corpora that exhibit some asymmetry. In this paper, we pursue two directions for overcoming relation asymmetry and entity asymmetry, respectively. The first approach starts from weakly comparable corpora (for high recall) and then ensures precision by propagating selectively, only to entities of symmetric relations. The second approach starts from parallel corpora (for high precision) and then enhances recall by extending the translation matrix based on node similarity and contextual similarity. Our experimental results on English-Chinese corpora show that both approaches are effective and complementary. Our combined approach outperforms the best-performing baseline in terms of F1-score by up to 0.28.
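The idea of propagating similarity only through symmetric relations can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. The function name `propagate`, the blending weight `alpha`, and the rule that a source-side neighbour counts as "asymmetric" when it has no scored counterpart among the target-side neighbours are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (source-language entity, target-language entity)


def propagate(
    node_sim: Dict[Pair, float],
    src_edges: Dict[str, Set[str]],
    tgt_edges: Dict[str, Set[str]],
    alpha: float = 0.5,
    iterations: int = 10,
    symmetric_only: bool = True,
) -> Dict[Pair, float]:
    """Blend each candidate pair's own similarity with its neighbours' similarity.

    Hypothetical sketch: node_sim holds initial (e.g. string-based) scores for
    candidate pairs; src_edges / tgt_edges are the two entity graphs.
    """
    sim = dict(node_sim)
    for _ in range(iterations):
        updated: Dict[Pair, float] = {}
        for (s, t), base in node_sim.items():
            support = []
            for s_nb in src_edges.get(s, set()):
                # Best-matching target-side neighbour for this source-side neighbour.
                scores = [sim.get((s_nb, t_nb), 0.0) for t_nb in tgt_edges.get(t, set())]
                best = max(scores, default=0.0)
                # Assumed asymmetry test: the relation has no counterpart on the
                # target side, so it is skipped instead of dragging the score down.
                if symmetric_only and best == 0.0:
                    continue
                support.append(best)
            if support:
                neighbour_score = sum(support) / len(support)
                updated[(s, t)] = (1 - alpha) * base + alpha * neighbour_score
            else:
                # No usable neighbours: keep the pair's own similarity.
                updated[(s, t)] = base
        sim = updated
    return sim


# Toy example: "Bo" is mentioned only in the source-language corpus.
node_sim = {("Obama", "ZH_Obama"): 0.6, ("Michelle", "ZH_Michelle"): 0.8}
src_edges = {"Obama": {"Michelle", "Bo"}, "Michelle": {"Obama"}}
tgt_edges = {"ZH_Obama": {"ZH_Michelle"}, "ZH_Michelle": {"ZH_Obama"}}

selective = propagate(node_sim, src_edges, tgt_edges, iterations=1)
naive = propagate(node_sim, src_edges, tgt_edges, iterations=1, symmetric_only=False)
print(selective[("Obama", "ZH_Obama")], naive[("Obama", "ZH_Obama")])
```

In this toy graph, the unmatched neighbour "Bo" lowers the naive score for the Obama pair to about 0.5 after one iteration, while the selective variant ignores it and gives about 0.7.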

Original language: English
Article number: 6945935
Pages (from-to): 3051-3063
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 26
Issue number: 12
DOIs: 10.1109/TKDE.2014.2316799
Publication status: Published - 2014 Dec 1

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

Lee, Taesung ; Cha, Young Rok ; Hwang, Seung Won. / Overcoming asymmetry in entity graphs. In: IEEE Transactions on Knowledge and Data Engineering. 2014 ; Vol. 26, No. 12. pp. 3051-3063.
@article{eedd7626a0254cd1936002e35a96b336,
title = "Overcoming asymmetry in entity graphs",
abstract = "This paper studies the problem of mining named entity translations by aligning comparable corpora. Current state-of-the-art approaches mine a translation pair by aligning an entity graph in one language to another based on node similarity or propagated similarity of related entities. However, they, building on the assumption of 'symmetry', quickly deteriorate on 'weakly' comparable corpora with some asymmetry. In this paper, we pursue two directions for overcoming relation and entity asymmetry respectively. The first approach starts from weakly comparable corpora (for high recall) then ensures precision by selective propagation only to entities of symmetric relations. The second approach starts from parallel corpora (for high precision) then enhances recall by extending the translation matrix based on node similarity and contextual similarity. Our experimental results on English-Chinese corpora show that both approaches are effective and complementary. Our combined approach outperforms the best-performing baseline in terms of F1-score by up to 0.28.",
author = "Lee, Taesung and Cha, {Young Rok} and Hwang, {Seung Won}",
year = "2014",
month = "12",
day = "1",
doi = "10.1109/TKDE.2014.2316799",
language = "English",
volume = "26",
pages = "3051--3063",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "12",

}

TY - JOUR

T1 - Overcoming asymmetry in entity graphs

AU - Lee, Taesung

AU - Cha, Young Rok

AU - Hwang, Seung Won

PY - 2014/12/1

Y1 - 2014/12/1

AB - This paper studies the problem of mining named entity translations by aligning comparable corpora. Current state-of-the-art approaches mine a translation pair by aligning an entity graph in one language to another based on node similarity or propagated similarity of related entities. However, they, building on the assumption of 'symmetry', quickly deteriorate on 'weakly' comparable corpora with some asymmetry. In this paper, we pursue two directions for overcoming relation and entity asymmetry respectively. The first approach starts from weakly comparable corpora (for high recall) then ensures precision by selective propagation only to entities of symmetric relations. The second approach starts from parallel corpora (for high precision) then enhances recall by extending the translation matrix based on node similarity and contextual similarity. Our experimental results on English-Chinese corpora show that both approaches are effective and complementary. Our combined approach outperforms the best-performing baseline in terms of F1-score by up to 0.28.

UR - http://www.scopus.com/inward/record.url?scp=84910042166&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84910042166&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2014.2316799

DO - 10.1109/TKDE.2014.2316799

M3 - Article

VL - 26

SP - 3051

EP - 3063

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 12

M1 - 6945935

ER -