Semisupervised sentiment analysis method for online text reviews

Gyeong Taek Lee, Chang Ouk Kim, Min Song

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Sentiment analysis plays an important role in understanding individual opinions expressed on websites such as social media and product review sites. Common approaches to sentiment analysis use the sentiments carried by opinion-bearing words and are based on either supervised or unsupervised learning techniques. The unsupervised learning approach builds a word-sentiment dictionary, but building a reliable dictionary takes a long time and is costly. The supervised learning approach uses machine learning models to learn the sentiment scores of words; however, training a classifier requires large amounts of labelled text data to achieve good performance. In this article, we propose a semisupervised approach that performs well even when only a small amount of labelled data is available for training. The proposed method builds a base sentiment dictionary from a small training dataset using a lasso-based ensemble model with minimal human effort. The scores of words not in the training dataset are then estimated using an adaptive instance-based learning model: in a pretrained word2vec model space, the sentiment values of the words in the dictionary are propagated to words that do not appear in the training dataset. Through two experiments, we demonstrate that the performance of the proposed method is comparable to that of supervised learning models trained on large datasets.
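The propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' adaptive instance-based model: it assumes a plain similarity-weighted k-nearest-neighbour average, and the two-dimensional vectors, the scores in `base_dictionary`, and the function names `cosine` and `propagate_score` are all hypothetical stand-ins for a pretrained word2vec space and a lasso-derived dictionary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def propagate_score(word, embeddings, base_dictionary, k=2):
    """Estimate a sentiment score for `word` from its k most similar
    dictionary words, weighting each neighbour's score by its similarity.

    Assumption: a fixed k and cosine weights; the paper's adaptive model
    may choose neighbours and weights differently.
    """
    target = embeddings[word]
    # Rank dictionary words by similarity to the target word.
    ranked = sorted(
        (
            (cosine(target, embeddings[w]), score)
            for w, score in base_dictionary.items()
            if w in embeddings
        ),
        reverse=True,
    )[:k]
    # Ignore neighbours that are not positively similar.
    neighbours = [(sim, score) for sim, score in ranked if sim > 0]
    if not neighbours:
        return 0.0
    total = sum(sim for sim, _ in neighbours)
    return sum(sim * score for sim, score in neighbours) / total


# Toy embedding space (hypothetical values, not real word2vec vectors).
embeddings = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
    "superb": [0.85, 0.15],
    "dreadful": [0.15, 0.85],
}

# Base dictionary as it might come out of the labelled training data.
base_dictionary = {"good": 1.0, "great": 0.8, "bad": -1.0, "awful": -0.9}

superb_score = propagate_score("superb", embeddings, base_dictionary)
dreadful_score = propagate_score("dreadful", embeddings, base_dictionary)
```

In this toy setup, "superb" and "dreadful" are absent from the base dictionary, so each inherits a score from its nearest labelled neighbours. The weighted average keeps every estimate within the range of the neighbouring scores.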

Original language: English
Pages (from-to): 387-403
Number of pages: 17
Journal: Journal of Information Science
Volume: 47
Issue number: 3
Publication status: Published - 2021 Jun

Bibliographical note

Funding Information:
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (No. NRF-2018S1A3A2075114).

Publisher Copyright:
© The Author(s) 2020.

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences
