Data-driven integration of multiple sentiment dictionaries for lexicon-based sentiment classification of product reviews

Heeryon Cho, Songkuk Kim, Jongseo Lee, Jong Seok Lee

Research output: Contribution to journal › Article › peer-review

82 Citations (Scopus)


In lexicon-based sentiment classification, the problem of contextual polarity must be explicitly handled since it is a major cause of classification error. One way to handle contextual polarity is to revise the prior polarity of the sentiment dictionary to fit the domain. This paper presents a data-driven method of adapting sentiment dictionaries to diverse domains. Our method first merges multiple sentiment dictionaries at the entry word level to expand the dictionary. Then, leveraging the ratio of the positive/negative training data, it selectively removes the entry words that do not contribute to the classification. Finally, it selectively switches the sentiment polarity of the entry words to adapt to the domain. In essence, our method compares each dictionary word's occurrence ratio in positive versus negative reviews with the overall ratio of positive to negative reviews to determine which entry words should be removed and which entry words should have their sentiment polarity switched. We show that the integrated sentiment dictionary constructed using our 'merge', 'remove', and 'switch' operations robustly outperforms individual dictionaries in the sentiment classification of product reviews across different domains such as smartphones, movies, and books.
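The 'merge', 'remove', and 'switch' operations described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact criteria, so the majority-vote merge rule, the binary +1/-1 polarities, the `margin` threshold, the use of per-review occurrence shares, and the function names `merge` and `adapt` are all assumptions made here for illustration.

```python
from collections import Counter

def merge(dictionaries):
    """Merge dictionaries at the entry-word level.
    Assumption: polarity is decided by majority sign; ties are dropped."""
    votes = Counter()
    for d in dictionaries:
        for word, polarity in d.items():
            votes[word] += 1 if polarity > 0 else -1
    return {w: (1 if v > 0 else -1) for w, v in votes.items() if v != 0}

def adapt(lexicon, reviews, margin=0.1):
    """Remove or switch entry words using labeled training reviews.
    reviews: non-empty list of (tokens, label) pairs, label is +1 or -1.
    Assumption: `margin` is a hypothetical threshold, not from the paper."""
    # Share of positive reviews in the training data (the baseline ratio).
    base = sum(1 for _, y in reviews if y > 0) / len(reviews)
    seen, seen_pos = Counter(), Counter()
    for tokens, y in reviews:
        for w in set(tokens):  # count each word once per review
            if w in lexicon:
                seen[w] += 1
                if y > 0:
                    seen_pos[w] += 1
    adapted = {}
    for w, prior in lexicon.items():
        if seen[w] == 0:
            adapted[w] = prior  # assumption: unseen words keep prior polarity
            continue
        share = seen_pos[w] / seen[w]
        if abs(share - base) <= margin:
            continue  # 'remove': the word tracks the baseline, so it is uninformative
        # 'switch': adopt the polarity the data supports, even if it
        # contradicts the prior polarity from the merged dictionary.
        adapted[w] = 1 if share > base else -1
    return adapted

# Toy data, invented for this example.
dict_a = {"good": 1, "cheap": -1, "phone": 1}
dict_b = {"good": 1, "slow": -1, "cheap": -1}
reviews = [
    (["good", "cheap", "phone"], 1),
    (["cheap", "phone"], 1),
    (["slow", "phone"], -1),
    (["slow", "phone"], -1),
]
merged = merge([dict_a, dict_b])
adapted = adapt(merged, reviews)
```

In this toy data, "phone" appears equally in positive and negative reviews and is removed, while "cheap" appears only in positive reviews and has its prior negative polarity switched to positive.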

Original language: English
Pages (from-to): 61-71
Number of pages: 11
Journal: Knowledge-Based Systems
Publication status: Published - 2014 Nov 1

Bibliographical note

Funding Information:
This research was supported by the Ministry of Science, ICT and Future Planning (MSIP), Korea, under the “IT Consilience Creative Program” (NIPA-2014-H0201-14-1002) supervised by the National IT Industry Promotion Agency (NIPA). We would like to thank Julian Brooke and Maite Taboada for providing us with SO-CAL sentiment dictionaries.

Publisher Copyright:
© 2014 Elsevier B.V. All rights reserved.

All Science Journal Classification (ASJC) codes

  • Software
  • Management Information Systems
  • Information Systems and Management
  • Artificial Intelligence

