People often rely on online reviews to make purchase decisions. The present work aimed to understand a machine learning model's prediction mechanism by using explainable AI (XAI) methodology to visualize the effect of sentiments extracted from online hotel reviews. Study 1 used the extracted sentiments as features to predict review ratings with five machine learning algorithms (k-nearest neighbors, CART decision trees, support vector machines, random forests, gradient boosting machines) and identified random forests as the best-performing algorithm. Study 2 analyzed the random forests model by feature importance and revealed the sentiments joy, disgust, positive, and negative as the most predictive features. Furthermore, the visualization of additive variable attributions and their prediction distribution showed predictions that were correct in direction and effect size for the 5-star rating, but partially wrong in direction and insufficient in effect size for the 1-star rating. These prediction details were corroborated by a what-if analysis of the four top features. In conclusion, the prediction mechanism of a machine learning model can be uncovered by visualizing particular observations. Comparing instances with contrasting ground truth values can draw a differential picture of the prediction mechanism and inform decisions for model improvement.
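The two analysis steps named in the abstract, feature importance and what-if analysis, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the toy relationship between sentiments and ratings is invented, a small k-nearest-neighbors predictor stands in for the random forests model, and permutation importance is used as one common model-agnostic importance measure (the abstract does not say which measure the study used). The function names are hypothetical.

```python
import random

# Feature names follow the four sentiments named in the abstract.
FEATURES = ["joy", "disgust", "positive", "negative"]


def make_synthetic_reviews(n, rng):
    """Generate made-up sentiment scores and ratings (NOT the study's data)."""
    rows, ratings = [], []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        # Assumed toy relationship: joy/positive raise the rating,
        # disgust/negative lower it.
        score = 3.0 + 1.5 * x[0] - 1.0 * x[1] + 0.5 * x[2] - 1.0 * x[3]
        ratings.append(min(5.0, max(1.0, score + rng.gauss(0, 0.1))))
        rows.append(x)
    return rows, ratings


def knn_predict(train_x, train_y, x, k=5):
    """Stand-in model: mean rating of the k nearest training reviews."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_x, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def mse(predict, rows, ratings):
    """Mean squared error of `predict` on the given reviews."""
    return sum((predict(x) - y) ** 2 for x, y in zip(rows, ratings)) / len(rows)


def permutation_importance(predict, rows, ratings, rng):
    """Increase in error after shuffling one feature column at a time."""
    base = mse(predict, rows, ratings)
    result = {}
    for j, name in enumerate(FEATURES):
        col = [x[j] for x in rows]
        rng.shuffle(col)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, col)]
        result[name] = mse(predict, shuffled, ratings) - base
    return result


def what_if_profile(predict, x, feature, grid):
    """Vary one feature of a single review while holding the others fixed."""
    j = FEATURES.index(feature)
    return [(v, predict(x[:j] + [v] + x[j + 1:])) for v in grid]


rng = random.Random(0)
train_x, train_y = make_synthetic_reviews(300, rng)
test_x, test_y = make_synthetic_reviews(100, rng)


def predict(x):
    return knn_predict(train_x, train_y, x)


importance = permutation_importance(predict, test_x, test_y, rng)
profile = what_if_profile(predict, test_x[0], "joy", [0.0, 0.25, 0.5, 0.75, 1.0])

for name, gain in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:>8}: error increase {gain:.3f}")
for value, rating in profile:
    print(f"joy={value:.2f} -> predicted rating {rating:.2f}")
```

Because both functions only call `predict`, the stand-in model could be swapped for any fitted regressor or classifier. Comparing the what-if profile of a 5-star review with that of a 1-star review would mirror the contrast the abstract describes.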
|Title of host publication||Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2020|
|Publisher||Association for Computing Machinery|
|Number of pages||6|
|Publication status||Published - 2020 Dec 18|
|Event||4th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2020 - Virtual, Online, Korea, Republic of (Duration: 2020 Dec 18 → 2020 Dec 20)|
|Name||ACM International Conference Proceeding Series|
|Conference||4th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2020|
|Country/Territory||Korea, Republic of|
|Period||2020 Dec 18 → 2020 Dec 20|
Bibliographical note
Funding Information:
This research was supported by the Yonsei University Faculty Research Fund of 2019-22-0199.
© 2020 ACM.
All Science Journal Classification (ASJC) codes
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Networks and Communications