Visual analysis of conflicting opinions

Chaomei Chen, Fidelia Ibekwe-SanJuan, Eric SanJuan, Chris Weaver

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

48 Citations (Scopus)

Abstract

Understanding the nature and dynamics of conflicting opinions is a profound and challenging issue. In this paper, we address several aspects of the issue through a study of more than 3,000 Amazon customer reviews of the controversial bestseller The Da Vinci Code, including 1,738 positive and 918 negative reviews. The study is motivated by critical questions such as: What are the differences between positive and negative reviews? What is the origin of a particular opinion? How do these opinions change over time? To what extent can differentiating features be identified from unstructured text? How accurately can these features predict the category of a review? We first analyze terminology variations in these reviews in terms of syntactic, semantic, and statistical associations identified by TermWatch and use term variation patterns to depict underlying topics. We then select the most predictive terms based on log likelihood tests and demonstrate that this small set of terms classifies over 70% of the conflicting reviews correctly. This feature selection process reduces the dimensionality of the feature space from more than 20,000 dimensions to a couple of hundred. We utilize automatically generated decision trees to facilitate the understanding of conflicting opinions in terms of these highly predictive terms. This study also uses a number of visualization and modeling tools to identify not only what positive and negative reviews have in common, but also how they differ and evolve over time.
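The abstract does not spell out the exact form of the log likelihood test, so the following is only a minimal sketch of one common choice: a log-likelihood ratio (G²) computed on a 2x2 table of term presence versus review polarity. The function names `g2` and `rank_terms`, the whitespace tokenizer, and the toy reviews are all assumptions made for illustration; they are not taken from the paper or from TermWatch.

```python
import math
from collections import Counter


def g2(k_pos, n_pos, k_neg, n_neg):
    """Log-likelihood ratio (G^2) for a term that occurs in k_pos of n_pos
    positive reviews and k_neg of n_neg negative reviews (assumed 2x2 design)."""
    total = n_pos + n_neg
    k = k_pos + k_neg  # reviews containing the term, regardless of polarity

    def cell(observed, expected):
        # A zero observed count contributes nothing to the sum.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    cells = [
        (k_pos, n_pos * k / total),                    # positive, term present
        (k_neg, n_neg * k / total),                    # negative, term present
        (n_pos - k_pos, n_pos * (total - k) / total),  # positive, term absent
        (n_neg - k_neg, n_neg * (total - k) / total),  # negative, term absent
    ]
    return 2.0 * sum(cell(o, e) for o, e in cells)


def rank_terms(pos_reviews, neg_reviews, top_n=10):
    """Return the top_n (term, score) pairs ranked by G^2, highest first."""
    # Document frequency: count each term at most once per review.
    pos_df = Counter(t for r in pos_reviews for t in set(r.lower().split()))
    neg_df = Counter(t for r in neg_reviews for t in set(r.lower().split()))
    vocab = set(pos_df) | set(neg_df)
    scored = [
        (t, g2(pos_df[t], len(pos_reviews), neg_df[t], len(neg_reviews)))
        for t in vocab
    ]
    # Sort by descending score, breaking ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    # Hypothetical toy reviews, not data from the study.
    positive = ["gripping thriller", "gripping plot"]
    negative = ["boring plot", "boring thriller"]
    for term, score in rank_terms(positive, negative, top_n=4):
        print(f"{term}\t{score:.3f}")
```

Keeping only the highest-scoring terms is one way a vocabulary of many thousands of terms could be cut down to a few hundred features before a decision tree is trained on them; the cutoff itself would need to be chosen for the data at hand.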

Original language: English
Title of host publication: IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings
Pages: 59-66
Number of pages: 8
DOIs: https://doi.org/10.1109/VAST.2006.261431
Publication status: Published - 2006 Dec 1
Event: IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Baltimore, MD, United States
Duration: 2006 Oct 31 to 2006 Nov 2

Fingerprint

Syntactics
Terminology
Decision trees
Feature extraction
Visualization
Semantics
Statistics

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering
  • Atomic and Molecular Physics, and Optics

Cite this

Chen, C., Ibekwe-SanJuan, F., SanJuan, E., & Weaver, C. (2006). Visual analysis of conflicting opinions. In IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings (pp. 59-66). [4035748] (IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings). https://doi.org/10.1109/VAST.2006.261431
@inproceedings{91edd477e80f404998a45054a4943e32,
title = "Visual analysis of conflicting opinions",
abstract = "Understanding the nature and dynamics of conflicting opinions is a profound and challenging issue. In this paper, we address several aspects of the issue through a study of more than 3,000 Amazon customer reviews of the controversial bestseller The Da Vinci Code, including 1,738 positive and 918 negative reviews. The study is motivated by critical questions such as: What are the differences between positive and negative reviews? What is the origin of a particular opinion? How do these opinions change over time? To what extent can differentiating features be identified from unstructured text? How accurately can these features predict the category of a review? We first analyze terminology variations in these reviews in terms of syntactic, semantic, and statistical associations identified by TermWatch and use term variation patterns to depict underlying topics. We then select the most predictive terms based on log likelihood tests and demonstrate that this small set of terms classifies over 70{\%} of the conflicting reviews correctly. This feature selection process reduces the dimensionality of the feature space from more than 20,000 dimensions to a couple of hundred. We utilize automatically generated decision trees to facilitate the understanding of conflicting opinions in terms of these highly predictive terms. This study also uses a number of visualization and modeling tools to identify not only what positive and negative reviews have in common, but also how they differ and evolve over time.",
author = "Chaomei Chen and Fidelia Ibekwe-SanJuan and Eric SanJuan and Chris Weaver",
year = "2006",
month = "12",
day = "1",
doi = "10.1109/VAST.2006.261431",
language = "English",
isbn = "1424405912",
series = "IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings",
pages = "59--66",
booktitle = "IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings",

}

TY - GEN

T1 - Visual analysis of conflicting opinions

AU - Chen, Chaomei

AU - Ibekwe-SanJuan, Fidelia

AU - SanJuan, Eric

AU - Weaver, Chris

PY - 2006/12/1

Y1 - 2006/12/1

N2 - Understanding the nature and dynamics of conflicting opinions is a profound and challenging issue. In this paper, we address several aspects of the issue through a study of more than 3,000 Amazon customer reviews of the controversial bestseller The Da Vinci Code, including 1,738 positive and 918 negative reviews. The study is motivated by critical questions such as: What are the differences between positive and negative reviews? What is the origin of a particular opinion? How do these opinions change over time? To what extent can differentiating features be identified from unstructured text? How accurately can these features predict the category of a review? We first analyze terminology variations in these reviews in terms of syntactic, semantic, and statistical associations identified by TermWatch and use term variation patterns to depict underlying topics. We then select the most predictive terms based on log likelihood tests and demonstrate that this small set of terms classifies over 70% of the conflicting reviews correctly. This feature selection process reduces the dimensionality of the feature space from more than 20,000 dimensions to a couple of hundred. We utilize automatically generated decision trees to facilitate the understanding of conflicting opinions in terms of these highly predictive terms. This study also uses a number of visualization and modeling tools to identify not only what positive and negative reviews have in common, but also how they differ and evolve over time.

UR - http://www.scopus.com/inward/record.url?scp=36349011694&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36349011694&partnerID=8YFLogxK

U2 - 10.1109/VAST.2006.261431

DO - 10.1109/VAST.2006.261431

M3 - Conference contribution

AN - SCOPUS:36349011694

SN - 1424405912

SN - 9781424405916

T3 - IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings

SP - 59

EP - 66

BT - IEEE Symposium on Visual Analytics Science and Technology 2006, VAST 2006 - Proceedings

ER -
