Developing a supervised learning-based social media business sentiment index

Hyeonseo Lee, Nakyeong Lee, Harim Seo, Min Song

Research output: Contribution to journal › Article

Abstract

The rapid growth of digital data generation has ushered in the era of big data, which is particularly valuable because approximately 70% of the data collected worldwide comes from social media. The investigation of online social network services is therefore of paramount importance. In this paper, we use sentiment analysis, which detects attitudes and emotions toward social issues posted on social media, to understand the actual economic situation. We propose two steps. In the first step, we train sentiment classifiers on several large social media datasets, considering three types of feature sets: feature vectors, sequence vectors, and a combination of dictionary-based features and sequence vectors. We then assess the performance of six classifiers: MaxEnt-L1, C4.5 decision tree, SVM-kernel, Ada-boost, Naïve Bayes and MaxEnt. In the second step, we collect datasets relevant to several economic words that the public uses to express opinions explicitly. Finally, we use vector auto-regression analysis to test our hypothesis. The results show a statistically significant relationship between public sentiment and economic performance: “depression” and “unemployment” lead the KOSPI. The results also show that keywords extracted from the sentiment analysis, such as “price,” “year-end-tax” and “budget deficit,” cause changes in exchange rates.
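
The lead-lag idea behind the vector auto-regression step can be illustrated with a minimal sketch. It is not the authors' procedure: it fits only one equation of a bivariate VAR(1) by ordinary least squares and computes a Granger-style F statistic for whether the lagged sentiment series improves the prediction of an index. The series, coefficients, and all function names (`solve`, `rss`, `lead_lag_f_stat`) are hypothetical, and the data are synthetic rather than KOSPI or exchange rate data.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, target):
    """Residual sum of squares of an OLS fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(target, fitted))


def lead_lag_f_stat(x, y):
    """F statistic for whether x[t-1] helps predict y[t] beyond y[t-1]."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic data (assumption): the index depends on yesterday's sentiment.
random.seed(0)
n = 200
sentiment = [random.gauss(0.0, 1.0) for _ in range(n)]
index = [0.0]
for t in range(1, n):
    noise = random.gauss(0.0, 0.1)
    index.append(0.5 * index[t - 1] + 0.8 * sentiment[t - 1] + noise)

f_forward = lead_lag_f_stat(sentiment, index)  # sentiment -> index
f_reverse = lead_lag_f_stat(index, sentiment)  # index -> sentiment
print("F(sentiment -> index):", round(f_forward, 2))
print("F(index -> sentiment):", round(f_reverse, 2))
```

A large forward statistic alongside a small reverse one is the pattern that a statement such as "sentiment leads the index" describes. A full analysis would select the lag order, test for stationarity, and compare the statistic against the F distribution.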

Original language: English
Journal: Journal of Supercomputing
DOI: 10.1007/s11227-018-02737-x
Publication status: Accepted/In press - 2019 Jan 1

Fingerprint

Social Media
Supervised learning
Sentiment Analysis
Maximum Entropy
Economics
Industry
Classifiers
Vector Autoregression
Unemployment
AdaBoost
Exchange rate
Tax
Bayes
Glossaries
Decision trees
Taxation
Feature Vector
Regression Analysis

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture

Cite this

@article{080c22ca15394316aec8e207dfbb5fbb,
title = "Developing a supervised learning-based social media business sentiment index",
author = "Hyeonseo Lee and Nakyeong Lee and Harim Seo and Min Song",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s11227-018-02737-x",
language = "English",
journal = "Journal of Supercomputing",
issn = "0920-8542",
publisher = "Springer Netherlands",

}

Developing a supervised learning-based social media business sentiment index. / Lee, Hyeonseo; Lee, Nakyeong; Seo, Harim; Song, Min.

In: Journal of Supercomputing, 01.01.2019.

TY - JOUR

T1 - Developing a supervised learning-based social media business sentiment index

AU - Lee, Hyeonseo

AU - Lee, Nakyeong

AU - Seo, Harim

AU - Song, Min

PY - 2019/1/1

Y1 - 2019/1/1

UR - http://www.scopus.com/inward/record.url?scp=85059850687&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059850687&partnerID=8YFLogxK

U2 - 10.1007/s11227-018-02737-x

DO - 10.1007/s11227-018-02737-x

M3 - Article

JO - Journal of Supercomputing

JF - Journal of Supercomputing

SN - 0920-8542

ER -