An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels

A Topic Mining-Based Analysis Method for Improving Service Quality in the Hotel Industry

  • Moon, Hyun Sil (School of Business & AI Management Research Center, KyungHee University) ;
  • Sung, David (School of Business & AI Management Research Center, KyungHee University) ;
  • Kim, Jae Kyeong (School of Business & AI Management Research Center, KyungHee University)
  • Received : 2019.01.08
  • Accepted : 2019.03.11
  • Published : 2019.03.31


Thanks to the rapid development of information technologies, the data available on the Internet have grown rapidly. In this era of big data, many studies have attempted to extract insights from data and to demonstrate the value of data analysis. In the tourism and hospitality industry, firms and researchers have paid particular attention to online reviews on social media because of their large influence on customers. Because tourism is an information-intensive industry, the effect of information shared on social media platforms is more pronounced than that of any other type of media. However, it is not straightforward to improve service quality on the basis of opinions posted on social media. Users express their opinions as text, images, and other formats, so the raw review data are unstructured. Moreover, these data sets are too large for people to extract new information and hidden knowledge from them manually. To use them in business intelligence and analytics applications, appropriate big data techniques such as natural language processing and data mining are needed. This study proposes an analytical approach that yields insights directly from these reviews to improve the service quality of hotels. The approach combines topic mining, which extracts the topics contained in the reviews, with decision tree modeling, which explains the relationship between topics and ratings. Topic mining refers to methods that find, from a collection of documents, groups of words that represent each document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, which is regarded as the most widely used. However, LDA alone is not enough to find insights that can improve service quality, because it cannot capture the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree (CART) method, a decision tree technique.
Through the CART method, we can identify which topics are related to positive or negative ratings of a hotel and visualize the results. This study thus aims to present an analytical approach for improving hotel service quality from unstructured review data sets. Through experiments on four hotels in Hong Kong, we identify the strengths and weaknesses of each hotel's services and suggest improvements to raise customer satisfaction. From positive reviews, we find what each hotel should maintain in its service quality. For example, the positive reviews of one hotel show that, compared with the other hotels, it has a good location and good room condition. From negative reviews, in contrast, we find what each hotel should change in its services. For example, one hotel should improve its room condition with respect to soundproofing. These results indicate that our approach is useful for finding insights into the service quality of hotels; that is, from an enormous volume of review data, it can provide practical suggestions for hotel managers. In the past, studies on improving service quality relied on customer surveys or interviews. However, these methods are often costly and time-consuming, and their results may be distorted by biased sampling or untrustworthy answers. The proposed approach obtains honest feedback directly from customers' online reviews and draws insights from it through big data analysis, so it can be a useful tool for overcoming the limitations of surveys and interviews. Moreover, the approach can readily be applied to other hotels or services in the tourism industry, because it needs only openly available online reviews and ratings as input. Furthermore, its performance is expected to improve if other structured and unstructured data sources are added.
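The two-stage approach described in the abstract (LDA topic proportions for each review, then a tree that relates those proportions to ratings) can be sketched roughly as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' implementation: the abstract does not say how LDA was fitted or how the trees were configured, so the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the Gini criterion, and the toy reviews are all assumptions, and the function names `lda_gibbs`, `gini`, and `best_split` are hypothetical. Only the root split of a CART-style tree is computed here; a full tree would apply `best_split` recursively to each side.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens of word v in topic k
    topic_total = [0] * n_topics                           # all tokens in topic k
    assign = []                                            # current topic of every token

    def update(d, k, v, delta):
        doc_topic[d][k] += delta
        topic_word[k][v] += delta
        topic_total[k] += delta

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, k, word_id[w], +1)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                update(d, assign[d][i], v, -1)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                assign[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                update(d, assign[d][i], v, +1)

    # Smoothed topic proportions; each row sums to 1.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(X, y):
    """Return (weighted Gini, feature index, threshold) of the best single split."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for low, high in zip(values, values[1:]):
            thr = (low + high) / 2
            left = [label for row, label in zip(X, y) if row[j] <= thr]
            right = [label for row, label in zip(X, y) if row[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, thr)
    return best  # None if no feature varies across the rows


if __name__ == "__main__":
    # Hypothetical, pre-tokenized toy reviews; rating 1 = positive, 0 = negative.
    reviews = [
        ["great", "location", "near", "station"],
        ["clean", "room", "great", "location"],
        ["friendly", "staff", "clean", "room"],
        ["noisy", "room", "thin", "walls"],
        ["noisy", "street", "poor", "soundproofing"],
        ["thin", "walls", "poor", "soundproofing"],
    ]
    ratings = [1, 1, 1, 0, 0, 0]
    theta = lda_gibbs(reviews, n_topics=2)
    for row, rating in zip(theta, ratings):
        print(rating, [round(p, 2) for p in row])
    print("best split (weighted Gini, topic index, threshold):",
          best_split(theta, ratings))
```

The split that comes out reads as a rule of the form "reviews whose proportion of topic j exceeds the threshold tend to be positive (or negative)", which is the kind of topic-to-rating relationship the abstract describes. In practice, library implementations of both steps (for example, scikit-learn's `LatentDirichletAllocation` and `DecisionTreeClassifier`) would normally replace these hand-written functions.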

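With advances in information technology, the amount of data available online is growing rapidly. In this era of big data, many studies have sought to uncover insights and demonstrate the value of data. Tourism in particular is an information-sensitive business in which social media is highly influential and consumers are strongly swayed by product reviews posted there, so many firms and researchers have analyzed social media in search of new services and insights. However, social media reviews are a typical form of unstructured text data and cannot be used for analysis without appropriate processing. Moreover, the sheer volume of review data makes manual analysis impractical. This study therefore proposes an analytical method that extracts insights for improving hotel service quality directly from online reviews on social media. To this end, we first apply topic mining to extract the topics contained in the review data. Topic mining refers to techniques that extract, from a large collection of documents, the sets of words that represent those documents; we perform it with the LDA model, which has been used in a wide range of studies. Topic mining alone, however, cannot reveal the relationship between topics and ratings, which makes it difficult to find insights for improving service quality. We therefore apply a decision tree model to the topic mining output to derive the relationship between topics and ratings. To evaluate the usefulness of this method, we collected online reviews of four hotels in Hong Kong and conducted an experiment interpreting the results of the proposed analysis. Positive reviews revealed the service areas each hotel should maintain, and negative reviews revealed the areas that need improvement. The proposed method is therefore expected to identify, from very large volumes of review data, the service areas to improve and to maintain.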
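For readers unfamiliar with LDA, its generative model can be written as the joint distribution below. This is the standard textbook form and is offered only as a reference sketch; the symbols are assumptions that follow common usage and do not appear in the abstract: $K$ topics, $D$ reviews, $N_d$ words in review $d$, topic-word distributions $\beta_k$, per-review topic proportions $\theta_d$, topic assignments $z_{d,n}$, observed words $w_{d,n}$, and Dirichlet hyperparameters $\alpha$ and $\eta$.

```latex
\begin{aligned}
\beta_k &\sim \mathrm{Dirichlet}(\eta), \qquad
\theta_d \sim \mathrm{Dirichlet}(\alpha), \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d), \qquad
w_{d,n} \mid z_{d,n}, \beta_{1:K} \sim \mathrm{Multinomial}(\beta_{z_{d,n}}), \\
p(\beta_{1:K}, \theta_{1:D}, z, w \mid \alpha, \eta)
  &= \prod_{k=1}^{K} p(\beta_k \mid \eta)
     \prod_{d=1}^{D} \Bigl( p(\theta_d \mid \alpha)
     \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\,
     p(w_{d,n} \mid \beta_{z_{d,n}}) \Bigr)
\end{aligned}
```

Fitting the model means estimating $\theta_d$ and $\beta_k$ from the observed words alone. In the proposed approach, the $\theta_d$ values (each review's topic proportions) are what the decision tree relates to ratings.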


Figure (JJSHBB_2019_v25n1_21_f0001.png): The Graphical Model for the Latent Dirichlet Allocation (Blei & Lafferty, 2009)
Figure (JJSHBB_2019_v25n1_21_f0002.png): Overall Procedure
Figure (JJSHBB_2019_v25n1_21_f0003.png): An Example of Review Data from TripAdvisor
Figure (JJSHBB_2019_v25n1_21_f0005.png): CART Results for Positive Reviews
Figure (JJSHBB_2019_v25n1_21_f0006.png): CART Results for Negative Reviews

Table (JJSHBB_2019_v25n1_21_t0002.png): An Illustrative Example of the LDA Analysis
Table (JJSHBB_2019_v25n1_21_t0003.png): An Illustrative Example of Each Topic's Proportions for Each Review
Table (JJSHBB_2019_v25n1_21_t0004.png): Data Description for Experiments
Table (JJSHBB_2019_v25n1_21_t0005.png): An Example of Topic Extraction
Table (JJSHBB_2019_v25n1_21_t0006.png): The Performance of Models for Positive Reviews
Table (JJSHBB_2019_v25n1_21_t0007.png): Main Rules of Positive Reviews for Each Hotel
Table (JJSHBB_2019_v25n1_21_t0008.png): The Performance of Models for Negative Reviews
Table (JJSHBB_2019_v25n1_21_t0009.png): Main Rules of Negative Reviews for Each Hotel
Table (JJSHBB_2019_v25n1_21_t0010.png): The Strengths and Weaknesses of Four Hotels


  1. Archak, N., A. Ghose, and P.G. Ipeirotis, "Deriving the pricing power of product features by mining consumer reviews," Management science, Vol.57, No.8(2011), 1485-1509.
  2. Askitas, N., and K.F. Zimmermann, "Google econometrics and unemployment forecasting," Applied Economics Quarterly, Vol.55, No.2(2009), 107-120.
  3. Baars, H., and H.G. Kemper, "Management support with structured and unstructured data -an integrated business intelligence framework," Information Systems Management, Vol.25, No.2(2008), 132-148.
  4. Blei, D. M., and J. Lafferty, Text mining: Classification, clustering, and applications, Chapman & Hall/CRC, 2009.
  5. Blei, D. M., A.Y. Ng, and M.I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, Vol.3, No.1(2003), 993-1022.
  6. Breiman, L., Classification and regression trees, Routledge, New York, 1984.
  7. Cambria, E., B. Schuller, Y. Xia, and C. Havasi, "New avenues in opinion mining and sentiment analysis," IEEE Intelligent Systems, Vol.28, No.2(2013), 15-21.
  8. Chae, S.H., J.I. Lim, and J. Kang, "A Comparative Analysis of Social Commerce and Open Market Using User Reviews in Korean Mobile Commerce," Journal of Intelligent Information Systems, Vol.21, No.4(2015), 53-77.
  9. Chen, Y., and J. Xie, "Online consumer review: Word-of-mouth as a new element of marketing communication mix," Management science, Vol.54, No.3(2008), 477-491.
  10. Chevalier, J. A., and D. Mayzlin, "The effect of word of mouth on sales: Online book reviews," Journal of marketing research, Vol.43, No.3(2006), 345-354.
  11. Choi, H., and H. Varian, "Predicting the present with Google Trends," Economic Record, Vol.88(2012), 2-9.
  12. Das, S. R., and M.Y. Chen, "Yahoo! for Amazon: Sentiment extraction from small talk on the web," Management science, Vol.53, No.9 (2007), 1375-1388.
  13. Dellarocas, C., "Strategic manipulation of internet opinion forums: Implications for consumers and firms," Management science, Vol.52, No.10(2006), 1577-1593.
  14. Dellarocas, C., X. Zhang, and N.F. Awad, "Exploring the value of online product reviews in forecasting sales: The case of motion pictures," Journal of Interactive marketing, Vol.21, No.4(2007), 23-45.
  15. Duan, W., B. Gu, and A.B. Whinston, "Do online reviews matter?-An empirical investigation of panel data," Decision support systems, Vol.45, No.4(2008), 1007-1016.
  16. Fang, B., Q. Ye, D. Kucukusta, and R. Law, "Analysis of the perceived value of online tourism reviews: Influence of readability and reviewer characteristics," Tourism Management, Vol.52(2016), 498-506.
  17. Forman, C., A. Ghose, and B. Wiesenfeld, "Examining the relationship between reviews and sales: The role of reviewer identity disclosure in electronic markets," Information Systems Research, Vol.19, No.3(2008), 291-313.
  18. Gantz, J., and D. Reinsel, "Extracting value from chaos," IDC review, Vol.1142(2011), 1-12.
  19. Goel, S., J.M. Hofman, S. Lahaie, D.M. Pennock, and D.J. Watts, "Predicting consumer behavior with Web search," Proceedings of the National Academy of Sciences(PNAS) (2010).
  20. Grewal, R., T.W. Cline, and A. Davies, "Early-entrant advantage, word‐of‐mouth communication, brand similarity, and the consumer decision‐making process," Journal of Consumer Psychology, Vol.13, No.3 (2003), 187-197.
  21. Griffiths, T. L., M. Steyvers, D.M. Blei, and J.B. Tenenbaum, "Integrating topics and syntax," In Advances in neural information processing systems (2005), 537-544.
  22. Hagenau, M., M. Liebmann, and D. Neumann, "Automated news reading: Stock price prediction based on financial news using context-capturing features," Decision Support Systems, Vol.55, No.3(2013), 685-697.
  23. Herr, P. M., F.R. Kardes, and J. Kim, "Effects of word-of-mouth and product-attribute information on persuasion: An accessibility-diagnosticity perspective," Journal of consumer research, Vol.17, No.4(1991), 454-462.
  24. Huang, J., and C.X. Ling, "Using AUC and accuracy in evaluating learning algorithms," IEEE Transactions on knowledge and Data Engineering, Vol.17, No.3(2005), 299-310.
  25. Kusumasondjaja, S., T. Shanka, and C. Marchegiani, "Credibility of online reviews and initial trust: The roles of reviewer's identity and review valence," Journal of Vacation Marketing, Vol.18, No.3(2012), 185-195.
  26. Lee, J., "How eWOM reduces uncertainties in decision-making process: using the concept of entropy in information theory," The Journal of Society for e-Business Studies, Vol.16, No.4(2011), 241-256.
  27. Lee, M. and H.J. Kim, "Construction of Event Networks from Large News Data Using Text Mining Techniques," Journal of Intelligent Information Systems, Vol.24, No.1(2018), 183-203.
  28. Liu, Y., "Word of mouth for movies: Its dynamics and impact on box office revenue," Journal of marketing, Vol.70, No.3(2006), 74-89.
  29. Liu, Z., and S. Park, "What makes a useful online review? Implication for travel product websites," Tourism Management, Vol.47(2015), 140-151.
  30. Loh, W. Y., "Classification and regression trees," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Vol.1, No.1(2011), 14-23.
  31. Mudambi, S. M., and D. Schuff, "Research note: What makes a helpful online review? A study of customer reviews on Amazon.com," MIS quarterly, Vol.34, No.1(2010), 185-200.
  32. Ogut, H., and B.K. Onur Tas, "The influence of internet customer reviews on the online sales and prices in hotel industry," The Service Industries Journal, Vol.32, No.2(2012), 197-214.
  33. Pan, B., T. MacLaurin, and J.C. Crotts, "Travel blogs and the implications for destination marketing," Journal of Travel Research, Vol.46, No.1(2007), 35-45.
  34. Park, S.M., and B.W. On, "Latent topics-based product reputation mining," Journal of Intelligent Information Systems, Vol.23, No.2 (2017), 39-70.
  35. Salehan, M., and D.J. Kim, "Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics," Decision Support Systems, Vol.81(2016), 30-40.
  36. Sparks, B. A., and V. Browning, "The impact of online reviews on hotel booking intentions and perception of trust," Tourism management, Vol.32, No.6(2011), 1310-1323.
  37. Sparks, B. A., H.E. Perkins, and R. Buckley, "Online travel reviews as persuasive communication: The effects of content type, source, and certification logos on consumer behavior," Tourism Management, Vol.39 (2013), 1-9.
  38. Ur-Rahman, N., and J.A. Harding, "Textual data mining for industrial knowledge management and text classification: A business oriented approach," Expert Systems with Applications, Vol.39, No.5(2012), 4729-4739.
  39. Vermeulen, I. E. and D. Seegers, "Tried and tested: The impact of online hotel reviews on consumer consideration," Tourism management, Vol.30, No.1(2009), 123-127.
  40. Werthner, H., and S. Klein, Information technology and tourism: a challenging relationship. Springer-Verlag Wien, 1999.
  41. Xiang, Z., and U. Gretzel, "Role of social media in online travel information search," Tourism management, Vol.31, No.2(2010), 179-188.
  42. Xie, K. L., Z. Zhang, and Z. Zhang, "The business value of online consumer reviews and management response to hotel performance," International Journal of Hospitality Management, Vol.43(2014), 1-12.
  43. Ye, Q., R. Law, G. Gu, and W. Chen, "The influence of user-generated content on traveler behavior: An empirical investigation on the effects of e-word-of-mouth to hotel online bookings," Computers in Human behavior, Vol.27, No.2(2011), 634-639.
  44. Zhang, X., and C. Dellarocas, "The lord of the ratings: is a movie's fate is influenced by reviews?," International Conference on Information Systems 2006 proceedings, (2006), 117.
  45. Zhang, Z., Q. Ye, R. Law, and Y. Li, "The impact of e-word-of-mouth on the online popularity of restaurants: A comparison of consumer reviews and editor reviews," International Journal of Hospitality Management, Vol.29, No.4(2010), 694-700.
  46. Zhao, F., Y. Zhu, H. Jin, and L.T. Yang, "A personalized hashtag recommendation approach using LDA-based topic model in microblog environment," Future Generation Computer Systems, Vol.65(2016), 196-206.
  47. Zikopoulos, P., and C. Eaton, Understanding big data: Analytics for enterprise class hadoop and streaming data, McGraw-Hill Osborne Media, 2011.