SocialSearch: Enhancing entity search with social network matching

Gae Won You, Seung Won Hwang, Zaiqing Nie, Ji Rong Wen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Citations (Scopus)

Abstract

This paper introduces the problem of matching people's names to their corresponding social network identities, such as their Twitter accounts. Existing tools for this purpose build upon naive textual matching and inevitably suffer from low precision, due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage "relational" evidence extracted from the Web corpus. In particular, as one example, we adopt Web document co-occurrences, which can be interpreted as an "implicit" counterpart of Twitter follower relationships. Using both textual and relational features, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a separate problem from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier, using not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and empirically validate that it outperforms the baseline approach. We evaluate our proposed system using real-life Internet-scale entity-relationship and social network graphs.
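The two-stage idea in the abstract (rank candidates, then score confidence separately) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the feature names `name_similarity` and `web_cooccurrence`, the fixed weights (the paper learns its ranking function), the candidate handles, and the particular summary statistics. The abstract does not specify the actual features or learners.

```python
from statistics import mean, pstdev


def rank_candidates(candidates, weights):
    """Order candidate accounts by a weighted sum of their features, best first.

    A fixed linear combination stands in for the learned ranking function.
    """
    def score(features):
        return sum(weights[name] * value for name, value in features.items())

    scored = [(account, score(features)) for account, features in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def confidence_features(scores):
    """Summarize the relevance scores of all candidates, not only the top one."""
    ordered = sorted(scores, reverse=True)
    top = ordered[0]
    runner_up = ordered[1] if len(ordered) > 1 else 0.0
    return {
        "top": top,                  # the baseline uses this value alone
        "margin": top - runner_up,   # how clearly the top match wins
        "mean": mean(ordered),
        "spread": pstdev(ordered),
    }


# Hypothetical weights and feature values, chosen only for illustration.
weights = {"name_similarity": 0.5, "web_cooccurrence": 0.5}
candidates = {
    "@real_account": {"name_similarity": 0.5, "web_cooccurrence": 1.0},
    "@impersonator": {"name_similarity": 1.0, "web_cooccurrence": 0.0},
    "@unrelated": {"name_similarity": 0.25, "web_cooccurrence": 0.25},
}

ranking = rank_candidates(candidates, weights)
features = confidence_features([score for _, score in ranking])
baseline_confidence = features["top"]
```

In this toy data the impersonator wins on textual similarity alone, but the relational feature moves the real account to the top. The `features` dictionary is the kind of input a separately trained confidence classifier could consume, whereas the baseline would use `baseline_confidence` directly.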

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2011
Subtitle of host publication: 14th International Conference on Extending Database Technology, Proceedings
Pages: 515-520
Number of pages: 6
DOIs: https://doi.org/10.1145/1951365.1951428
Publication status: Published - 2011 Apr 18
Event: 14th International Conference on Extending Database Technology: Advances in Database Technology, EDBT 2011 - Uppsala, Sweden
Duration: 2011 Mar 22 → 2011 Mar 24

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 14th International Conference on Extending Database Technology: Advances in Database Technology, EDBT 2011
Country: Sweden
City: Uppsala
Period: 2011 Mar 22 → 2011 Mar 24

Fingerprint

Classifiers

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

You, G. W., Hwang, S. W., Nie, Z., & Wen, J. R. (2011). SocialSearch: Enhancing entity search with social network matching. In Advances in Database Technology - EDBT 2011: 14th International Conference on Extending Database Technology, Proceedings (pp. 515-520). (ACM International Conference Proceeding Series). https://doi.org/10.1145/1951365.1951428
You, Gae Won; Hwang, Seung Won; Nie, Zaiqing; Wen, Ji Rong. / SocialSearch: Enhancing entity search with social network matching. Advances in Database Technology - EDBT 2011: 14th International Conference on Extending Database Technology, Proceedings. 2011. pp. 515-520 (ACM International Conference Proceeding Series).
@inproceedings{450afe80ca5d4c6d821e1946405507ac,
title = "SocialSearch: Enhancing entity search with social network matching",
abstract = "This paper introduces the problem of matching people's names to their corresponding social network identities, such as their Twitter accounts. Existing tools for this purpose build upon naive textual matching and inevitably suffer from low precision, due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage {"}relational{"} evidence extracted from the Web corpus. In particular, as one example, we adopt Web document co-occurrences, which can be interpreted as an {"}implicit{"} counterpart of Twitter follower relationships. Using both textual and relational features, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a separate problem from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier, using not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and empirically validate that it outperforms the baseline approach. We evaluate our proposed system using real-life Internet-scale entity-relationship and social network graphs.",
author = "You, {Gae Won} and Hwang, {Seung Won} and Nie, Zaiqing and Wen, {Ji Rong}",
year = "2011",
month = "4",
day = "18",
doi = "10.1145/1951365.1951428",
language = "English",
isbn = "9781450305280",
series = "ACM International Conference Proceeding Series",
pages = "515--520",
booktitle = "Advances in Database Technology - EDBT 2011",

}

You, GW, Hwang, SW, Nie, Z & Wen, JR 2011, SocialSearch: Enhancing entity search with social network matching. in Advances in Database Technology - EDBT 2011: 14th International Conference on Extending Database Technology, Proceedings. ACM International Conference Proceeding Series, pp. 515-520, 14th International Conference on Extending Database Technology: Advances in Database Technology, EDBT 2011, Uppsala, Sweden, 11/3/22. https://doi.org/10.1145/1951365.1951428

TY - GEN

T1 - SocialSearch

T2 - Enhancing entity search with social network matching

AU - You, Gae Won

AU - Hwang, Seung Won

AU - Nie, Zaiqing

AU - Wen, Ji Rong

PY - 2011/4/18

Y1 - 2011/4/18

N2 - This paper introduces the problem of matching people's names to their corresponding social network identities, such as their Twitter accounts. Existing tools for this purpose build upon naive textual matching and inevitably suffer from low precision, due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage "relational" evidence extracted from the Web corpus. In particular, as one example, we adopt Web document co-occurrences, which can be interpreted as an "implicit" counterpart of Twitter follower relationships. Using both textual and relational features, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a separate problem from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier, using not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and empirically validate that it outperforms the baseline approach. We evaluate our proposed system using real-life Internet-scale entity-relationship and social network graphs.

AB - This paper introduces the problem of matching people's names to their corresponding social network identities, such as their Twitter accounts. Existing tools for this purpose build upon naive textual matching and inevitably suffer from low precision, due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage "relational" evidence extracted from the Web corpus. In particular, as one example, we adopt Web document co-occurrences, which can be interpreted as an "implicit" counterpart of Twitter follower relationships. Using both textual and relational features, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a separate problem from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier, using not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and empirically validate that it outperforms the baseline approach. We evaluate our proposed system using real-life Internet-scale entity-relationship and social network graphs.

UR - http://www.scopus.com/inward/record.url?scp=79953885743&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79953885743&partnerID=8YFLogxK

U2 - 10.1145/1951365.1951428

DO - 10.1145/1951365.1951428

M3 - Conference contribution

AN - SCOPUS:79953885743

SN - 9781450305280

T3 - ACM International Conference Proceeding Series

SP - 515

EP - 520

BT - Advances in Database Technology - EDBT 2011

ER -

You GW, Hwang SW, Nie Z, Wen JR. SocialSearch: Enhancing entity search with social network matching. In Advances in Database Technology - EDBT 2011: 14th International Conference on Extending Database Technology, Proceedings. 2011. p. 515-520. (ACM International Conference Proceeding Series). https://doi.org/10.1145/1951365.1951428