Purpose: The purpose of this paper is to introduce a new multiple imputation method that can effectively manage missing values in online review data, thereby allowing the online review analysis to yield valid results by using all available data. Design/methodology/approach: This study develops a missing data method based on the multivariate imputation chained equation to generate imputed values for online reviews. Sentiment analysis is used to incorporate customers’ textual opinions as the auxiliary information in the imputation procedures. To check the validity of the proposed imputation method, the authors apply this method to missing values of sub-ratings on hotel attributes in both the simulated and real Honolulu hotel review data sets. The estimation results are compared to those of different missing data techniques, namely, listwise deletion and conventional multiple imputation which does not consider text reviews. Findings: The findings from the simulation analysis show that the imputation method of the authors produces more efficient and less biased estimates compared to the other two missing data techniques when text reviews are possibly associated with the rating scores and response mechanism. When applying the imputation method to the real hotel review data, the findings show that the text sentiment-based propensity score can effectively explain the missingness of sub-ratings on hotel attributes, and the imputation method considering those propensity scores has better estimation results than the other techniques as in the simulation analysis. Originality/value: This study extends multiple imputation to online data considering its spontaneous and unstructured nature. This new method helps make the fuller use of the observed online data while avoiding potential missing problems.
|Number of pages||18|
|Journal||International Journal of Contemporary Hospitality Management|
|Publication status||Published - 2018 Nov 20|
Bibliographical notePublisher Copyright:
© 2018, Emerald Publishing Limited.
All Science Journal Classification (ASJC) codes
- Tourism, Leisure and Hospitality Management