Sentiment labeling for extending initial labeled data to improve semi-supervised sentiment classification

Sangheon Lee, Wooju Kim

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)


In recent decades, analyzing the sentiments in online customer reviews has become important to many businesses and researchers. However, an insufficient amount of labeled training data is a bottleneck for machine learning approaches. Self-training is a promising semi-supervised technique that does not require large amounts of labeled data. However, self-training suffers from incorrect labeling as well as from the shortage of labeled data. This study proposed a semi-supervised learning framework that adds only confidently predicted data to the training corpus in order to enrich the initial classifier in self-training. The experimental results indicate that the proposed method performed better than self-training.
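The idea of adding only confidently predicted data to the training corpus can be sketched as below. This is a minimal illustration of generic confidence-thresholded self-training, not the authors' framework: the tiny Naive Bayes classifier, the 0.7 threshold, the function names (`train_nb`, `predict_proba`, `self_train`), and the toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a tiny multinomial Naive Bayes model with add-one smoothing."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def predict_proba(model, doc):
    """Return a dict mapping each label to its posterior probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        denom = sum(counts.values()) + len(vocab)
        score = math.log(class_counts[label] / total_docs)
        for word in doc.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)
        log_scores[label] = score
    peak = max(log_scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exps.values())
    return {label: v / norm for label, v in exps.items()}

def self_train(labeled, unlabeled, threshold=0.7, max_rounds=10):
    """Grow the labeled set using only predictions at or above `threshold`."""
    docs = [doc for doc, _ in labeled]
    labels = [label for _, label in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining, added = [], 0
        for doc in pool:
            probs = predict_proba(model, doc)
            label, confidence = max(probs.items(), key=lambda kv: kv[1])
            if confidence >= threshold:
                docs.append(doc)       # confident: accept the predicted label
                labels.append(label)
                added += 1
            else:
                remaining.append(doc)  # uncertain: keep it unlabeled
        pool = remaining
        if added == 0 or not pool:
            break
    return train_nb(docs, labels), pool

# Toy data, invented purely for illustration.
labeled = [("great product love it", "pos"), ("terrible product hate it", "neg")]
unlabeled = ["love it great", "hate it terrible", "product"]
model, leftover = self_train(labeled, unlabeled)
```

In this toy run the ambiguous review "product" never reaches the threshold, so it stays in `leftover` instead of being added with a guessed label. Raising the threshold reduces the risk of incorrect labels at the cost of adding fewer examples per round.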

Original language: English
Pages (from-to): 35-49
Number of pages: 15
Journal: Electronic Commerce Research and Applications
Publication status: Published - 2017 Nov

Bibliographical note

Funding Information:
This work (Grant No. C0258734) was supported by the Business for Academic-industrial Cooperative Establishments funded by the Korea Small and Medium Business Administration in 2016. This work was also financially supported by the Korea Ministry of Land, Infrastructure and Transport (MOLIT) via the U-City Master and Doctor Course Grant Program.

Publisher Copyright:
© 2017 Elsevier B.V.

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Computer Networks and Communications
  • Marketing
  • Management of Technology and Innovation

