In lexicon-based sentiment classification, the problem of contextual polarity must be handled explicitly, since it is a major cause of classification error. One way to handle contextual polarity is to revise the prior polarity of the sentiment dictionary to fit the domain. This paper presents a data-driven method of adapting sentiment dictionaries to diverse domains. Our method first merges multiple sentiment dictionaries at the entry word level to expand the dictionary. Then, leveraging the ratio of positive to negative training data, it selectively removes the entry words that do not contribute to the classification. Finally, it selectively switches the sentiment polarity of entry words to adapt to the domain. In essence, our method compares the ratio of each dictionary word's occurrences in positive versus negative reviews with the ratio of positive to negative reviews itself, to determine which entry words to remove and which entry words' sentiment polarity to switch. We show that the integrated sentiment dictionary constructed using our 'merge', 'remove', and 'switch' operations robustly outperforms individual dictionaries in the sentiment classification of product reviews across different domains such as smartphones, movies, and books.
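The three operations can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact decision rules, so the function names `merge` and `adapt`, the `margin` threshold, the ±1 polarity encoding, the use of document frequency, and the "first dictionary wins" conflict rule are all assumptions made for illustration.

```python
from collections import Counter

def merge(*dictionaries):
    """Merge dictionaries at the entry-word level.
    Assumption: on a polarity conflict, the earliest dictionary wins."""
    merged = {}
    for d in dictionaries:
        for word, polarity in d.items():
            merged.setdefault(word, polarity)
    return merged

def adapt(lexicon, reviews, margin=0.1):
    """Apply hypothetical 'remove' and 'switch' rules.
    lexicon: word -> +1 / -1; reviews: list of (tokens, label),
    label in {'pos', 'neg'}. `margin` is an assumed threshold."""
    if not reviews:
        raise ValueError("need at least one labelled review")
    n_pos = sum(1 for _, label in reviews if label == "pos")
    base = n_pos / len(reviews)  # share of positive reviews overall

    # Count, per word, how many positive/negative reviews contain it.
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in reviews:
        for word in set(tokens):
            if word in lexicon:
                (pos_df if label == "pos" else neg_df)[word] += 1

    adapted = {}
    for word, prior in lexicon.items():
        total = pos_df[word] + neg_df[word]
        if total == 0:
            adapted[word] = prior      # unseen in training data: keep prior
            continue
        ratio = pos_df[word] / total   # share of positive reviews among those with the word
        if abs(ratio - base) <= margin:
            continue                   # 'remove': word mirrors the overall ratio
        observed = 1 if ratio > base else -1
        adapted[word] = prior if prior == observed else -prior  # 'switch'
    return adapted

# Toy usage with made-up dictionaries and reviews.
merged = merge({"good": 1, "cheap": 1}, {"bad": -1, "cheap": -1, "phone": 1})
reviews = [
    (["good", "phone"], "pos"),
    (["good", "phone"], "pos"),
    (["cheap", "bad", "phone"], "neg"),
    (["cheap", "phone"], "neg"),
]
adapted = adapt(merged, reviews)
```

In this toy data, "phone" appears equally often in positive and negative reviews, so its occurrence ratio matches the overall review ratio and it is dropped, while "cheap" occurs only in negative reviews and has its assumed positive prior flipped.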
Bibliographical note
Funding Information:
This research was supported by the Ministry of Science, ICT and Future Planning (MSIP), Korea, under the “IT Consilience Creative Program” (NIPA-2014-H0201-14-1002) supervised by the National IT Industry Promotion Agency (NIPA). We would like to thank Julian Brooke and Maite Taboada for providing us with SO-CAL sentiment dictionaries.
© 2014 Elsevier B.V. All rights reserved.
All Science Journal Classification (ASJC) codes
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence