Dynamic topic detection and tracking

A comparison of HDP, C-word, and cocitation methods

Wanying Ding, Chaomei Chen

Research output: Contribution to journal, Article

22 Citations (Scopus)

Abstract

Cocitation and co-word methods have long been used to detect and track emerging topics in scientific literature, but both have weaknesses. Recently, while many researchers have adopted generative probabilistic models for topic detection and tracking, few have compared generative probabilistic models with traditional cocitation and co-word methods in terms of their overall performance. In this article, we compare the performance of hierarchical Dirichlet process (HDP), a promising generative probabilistic model, with that of the 2 traditional topic detecting and tracking methods - cocitation analysis and co-word analysis. We visualize and explore the relationships between topics identified by the 3 methods in hierarchical edge bundling graphs and time flow graphs. Our result shows that HDP is more sensitive and reliable than the other 2 methods in both detecting and tracking emerging topics. Furthermore, we demonstrate the important topics and topic evolution trends in the literature of terrorism research with the HDP method.

Original language: English
Pages (from-to): 2084-2097
Number of pages: 14
Journal: Journal of the Association for Information Science and Technology
Volume: 65
Issue number: 10
DOI: 10.1002/asi.23134
Publication status: Published - 2014 Oct 1

Fingerprint

Flow graphs
Terrorism
technical literature
performance
Statistical Models
Co-citation
Dirichlet process
trend
Probabilistic model
Graph
literature
time
Flow time
Bundling
Co-citation analysis

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
  • Library and Information Sciences

Cite this

@article{dc6d17c4ed0642899e8e28e70409ecdb,
title = "Dynamic topic detection and tracking: A comparison of HDP, C-word, and cocitation methods",
abstract = "Cocitation and co-word methods have long been used to detect and track emerging topics in scientific literature, but both have weaknesses. Recently, while many researchers have adopted generative probabilistic models for topic detection and tracking, few have compared generative probabilistic models with traditional cocitation and co-word methods in terms of their overall performance. In this article, we compare the performance of hierarchical Dirichlet process (HDP), a promising generative probabilistic model, with that of the 2 traditional topic detecting and tracking methods - cocitation analysis and co-word analysis. We visualize and explore the relationships between topics identified by the 3 methods in hierarchical edge bundling graphs and time flow graphs. Our result shows that HDP is more sensitive and reliable than the other 2 methods in both detecting and tracking emerging topics. Furthermore, we demonstrate the important topics and topic evolution trends in the literature of terrorism research with the HDP method.",
author = "Wanying Ding and Chaomei Chen",
year = "2014",
month = "10",
day = "1",
doi = "10.1002/asi.23134",
language = "English",
volume = "65",
pages = "2084--2097",
journal = "Journal of the Association for Information Science and Technology",
issn = "2330-1635",
publisher = "John Wiley and Sons Ltd",
number = "10",
}

Dynamic topic detection and tracking: A comparison of HDP, C-word, and cocitation methods. / Ding, Wanying; Chen, Chaomei.

In: Journal of the Association for Information Science and Technology, Vol. 65, No. 10, 01.10.2014, p. 2084-2097.

TY - JOUR

T1 - Dynamic topic detection and tracking

T2 - A comparison of HDP, C-word, and cocitation methods

AU - Ding, Wanying

AU - Chen, Chaomei

PY - 2014/10/1

Y1 - 2014/10/1

AB - Cocitation and co-word methods have long been used to detect and track emerging topics in scientific literature, but both have weaknesses. Recently, while many researchers have adopted generative probabilistic models for topic detection and tracking, few have compared generative probabilistic models with traditional cocitation and co-word methods in terms of their overall performance. In this article, we compare the performance of hierarchical Dirichlet process (HDP), a promising generative probabilistic model, with that of the 2 traditional topic detecting and tracking methods - cocitation analysis and co-word analysis. We visualize and explore the relationships between topics identified by the 3 methods in hierarchical edge bundling graphs and time flow graphs. Our result shows that HDP is more sensitive and reliable than the other 2 methods in both detecting and tracking emerging topics. Furthermore, we demonstrate the important topics and topic evolution trends in the literature of terrorism research with the HDP method.

UR - http://www.scopus.com/inward/record.url?scp=84930400661&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84930400661&partnerID=8YFLogxK

U2 - 10.1002/asi.23134

DO - 10.1002/asi.23134

M3 - Article

VL - 65

SP - 2084

EP - 2097

JO - Journal of the Association for Information Science and Technology

JF - Journal of the Association for Information Science and Technology

SN - 2330-1635

IS - 10

ER -