Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter

Wanying Ding, Yue Zhang, Chaomei Chen, Xiaohua Hu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Understanding ongoing topics and their evolution in social media is of great importance. Although topic analysis is not a novel research question, the social media environment presents new challenges. First, short texts provide insufficient co-occurrence information, which undermines the applicability of many topic models based on word co-occurrence. Second, real-time message streams make traditional discretized topic tracking methods hard to apply. Third, topics' evolution mechanisms are of great importance in the social media context, but many studies have ignored them. Fourth, topics have more complicated correlations with each other. To address these problems, this paper proposes a Semi-Supervised Dirichlet-Hawkes Process (SDHP) for topic detection and tracking in social media. The main contributions of this paper are: (1) SDHP can handle the short text problem efficiently; (2) SDHP can track topics from a continuous message stream; (3) SDHP can reveal topics' underlying evolution patterns; and (4) SDHP can capture topics' correlations. We evaluated SDHP's ability in both topic detection and tracking on 8 real datasets from Twitter, and the algorithm's performance is very promising.

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Editors: Ronay Ak, George Karypis, Yinglong Xia, Xiaohua Tony Hu, Philip S. Yu, James Joshi, Lyle Ungar, Ling Liu, Aki-Hiro Sato, Toyotaro Suzumura, Sudarsan Rachuri, Rama Govindaraju, Weijia Xu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 869-874
Number of pages: 6
ISBN (Electronic): 9781467390040
DOIs: https://doi.org/10.1109/BigData.2016.7840680
Publication status: Published - 2016 Jan 1
Event: 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: 2016 Dec 5 to 2016 Dec 8

Other

Other: 4th IEEE International Conference on Big Data, Big Data 2016
Country: United States
City: Washington
Period: 16/12/5 to 16/12/8

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

Cite this

Ding, W., Zhang, Y., Chen, C., & Hu, X. (2016). Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter. In R. Ak, G. Karypis, Y. Xia, X. T. Hu, P. S. Yu, J. Joshi, L. Ungar, L. Liu, A-H. Sato, T. Suzumura, S. Rachuri, R. Govindaraju, ... W. Xu (Eds.), Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 (pp. 869-874). [7840680] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2016.7840680
Ding, Wanying ; Zhang, Yue ; Chen, Chaomei ; Hu, Xiaohua. / Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter. Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. editor / Ronay Ak ; George Karypis ; Yinglong Xia ; Xiaohua Tony Hu ; Philip S. Yu ; James Joshi ; Lyle Ungar ; Ling Liu ; Aki-Hiro Sato ; Toyotaro Suzumura ; Sudarsan Rachuri ; Rama Govindaraju ; Weijia Xu. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 869-874
@inproceedings{5590cd2444f246c2acb7854dd517d9f5,
title = "Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter",
abstract = "Understanding ongoing topics and their evolution in social media is of great importance. Although topic analysis is not a novel research question, the social media environment presents new challenges. First, short texts provide insufficient co-occurrence information, which undermines the applicability of many topic models based on word co-occurrence. Second, real-time message streams make traditional discretized topic tracking methods hard to apply. Third, topics' evolution mechanisms are of great importance in the social media context, but many studies have ignored them. Fourth, topics have more complicated correlations with each other. To address these problems, this paper proposes a Semi-Supervised Dirichlet-Hawkes Process (SDHP) for topic detection and tracking in social media. The main contributions of this paper are: (1) SDHP can handle the short text problem efficiently; (2) SDHP can track topics from a continuous message stream; (3) SDHP can reveal topics' underlying evolution patterns; and (4) SDHP can capture topics' correlations. We evaluated SDHP's ability in both topic detection and tracking on 8 real datasets from Twitter, and the algorithm's performance is very promising.",
author = "Wanying Ding and Yue Zhang and Chaomei Chen and Xiaohua Hu",
year = "2016",
month = "1",
day = "1",
doi = "10.1109/BigData.2016.7840680",
language = "English",
pages = "869--874",
editor = "Ronay Ak and George Karypis and Yinglong Xia and Hu, {Xiaohua Tony} and Yu, {Philip S.} and James Joshi and Lyle Ungar and Ling Liu and Aki-Hiro Sato and Toyotaro Suzumura and Sudarsan Rachuri and Rama Govindaraju and Weijia Xu",
booktitle = "Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Ding, W, Zhang, Y, Chen, C & Hu, X 2016, Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter. in R Ak, G Karypis, Y Xia, XT Hu, PS Yu, J Joshi, L Ungar, L Liu, A-H Sato, T Suzumura, S Rachuri, R Govindaraju & W Xu (eds), Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016., 7840680, Institute of Electrical and Electronics Engineers Inc., pp. 869-874, 4th IEEE International Conference on Big Data, Big Data 2016, Washington, United States, 16/12/5. https://doi.org/10.1109/BigData.2016.7840680

Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter. / Ding, Wanying; Zhang, Yue; Chen, Chaomei; Hu, Xiaohua.

Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. ed. / Ronay Ak; George Karypis; Yinglong Xia; Xiaohua Tony Hu; Philip S. Yu; James Joshi; Lyle Ungar; Ling Liu; Aki-Hiro Sato; Toyotaro Suzumura; Sudarsan Rachuri; Rama Govindaraju; Weijia Xu. Institute of Electrical and Electronics Engineers Inc., 2016. p. 869-874 7840680.


TY - GEN

T1 - Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter

AU - Ding, Wanying

AU - Zhang, Yue

AU - Chen, Chaomei

AU - Hu, Xiaohua

PY - 2016/1/1

Y1 - 2016/1/1

AB - Understanding ongoing topics and their evolution in social media is of great importance. Although topic analysis is not a novel research question, the social media environment presents new challenges. First, short texts provide insufficient co-occurrence information, which undermines the applicability of many topic models based on word co-occurrence. Second, real-time message streams make traditional discretized topic tracking methods hard to apply. Third, topics' evolution mechanisms are of great importance in the social media context, but many studies have ignored them. Fourth, topics have more complicated correlations with each other. To address these problems, this paper proposes a Semi-Supervised Dirichlet-Hawkes Process (SDHP) for topic detection and tracking in social media. The main contributions of this paper are: (1) SDHP can handle the short text problem efficiently; (2) SDHP can track topics from a continuous message stream; (3) SDHP can reveal topics' underlying evolution patterns; and (4) SDHP can capture topics' correlations. We evaluated SDHP's ability in both topic detection and tracking on 8 real datasets from Twitter, and the algorithm's performance is very promising.

UR - http://www.scopus.com/inward/record.url?scp=85015168292&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85015168292&partnerID=8YFLogxK

U2 - 10.1109/BigData.2016.7840680

DO - 10.1109/BigData.2016.7840680

M3 - Conference contribution

SP - 869

EP - 874

BT - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016

A2 - Ak, Ronay

A2 - Karypis, George

A2 - Xia, Yinglong

A2 - Hu, Xiaohua Tony

A2 - Yu, Philip S.

A2 - Joshi, James

A2 - Ungar, Lyle

A2 - Liu, Ling

A2 - Sato, Aki-Hiro

A2 - Suzumura, Toyotaro

A2 - Rachuri, Sudarsan

A2 - Govindaraju, Rama

A2 - Xu, Weijia

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Ding W, Zhang Y, Chen C, Hu X. Semi-supervised Dirichlet-Hawkes process with applications of topic detection and tracking in Twitter. In Ak R, Karypis G, Xia Y, Hu XT, Yu PS, Joshi J, Ungar L, Liu L, Sato A-H, Suzumura T, Rachuri S, Govindaraju R, Xu W, editors, Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 869-874. 7840680 https://doi.org/10.1109/BigData.2016.7840680