Along with structured data such as company size, loan balance, and savings accounts, this study analyzed the voice of customer (VOC), which is text data containing contact history and counseling details. To analyze the unstructured data, term frequency-inverse document frequency (TF-IDF) analysis, semantic network analysis, sentiment analysis, and a convolutional neural network (CNN) were implemented. A performance comparison revealed that the predictive model using the CNN had the best predictive power, followed by the model using TF-IDF and then the model using semantic network analysis. In particular, a character-level CNN and a word-level CNN were developed separately, and the character-level CNN performed better in the analysis of Korean-language text. Moreover, a systematic selection model for optimal text mining techniques was proposed, suggesting which analytical technique is appropriate for analyzing text data depending on the context. This study also provides evidence that findings from previous studies, namely that individual customers leave when their loyalty and switching costs are low, also apply to corporate customers, and it suggests that VOC data reflecting customers' needs are very effective for predicting their behavior.
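As an illustration of the TF-IDF weighting named in the abstract, the following is a minimal sketch and not the study's implementation. It assumes the textbook form (relative term frequency multiplied by log(N / document frequency)); the abstract does not state which variant the authors used. The function name `tf_idf` and the sample VOC snippets are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not confirmed by the source):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical VOC snippets, tokenized on whitespace for illustration only.
voc = [
    "loan rate too high".split(),
    "loan renewal request".split(),
    "branch moved too far".split(),
]
scores = tf_idf(voc)
```

Terms that appear in fewer documents receive larger weights, so in the first snippet "rate" outweighs "loan". Passing lists of characters instead of words would give a character-level variant of the same weighting.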
|Number of pages|19|
|Journal|KSII Transactions on Internet and Information Systems|
|Publication status|Published - 2020 Dec 31|
Bibliographical note
Funding Information:
This paper is based on the doctoral dissertation of the first author (Hoon Jung, 2020, Exploring the Methods of Transforming Unstructured Data into Structured Data and Their Impact on a Churn Prediction Model, The Graduate School of Yonsei University).
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Networks and Communications