Applying Academic Theory with Text Mining to Offer Business Insight: Illustration of Evaluating Hotel Service Quality

  • Choong C. Lee (Graduate School of Information, Yonsei University)
  • Kun Kim (Graduate School of Information, Yonsei University)
  • Haejung Yun (College of Science & Industry Convergence, Ewha Womans University)
  • Received : 2019.05.13
  • Accepted : 2019.07.31
  • Published : 2019.12.31

Abstract

IS scholars now need to demonstrate the added value of integrating academic theory with text mining, to show text mining practitioners outside academia how to carry out this integration, and to move toward establishing it as standard practice. To that end, this study develops a systematic theory-based text-mining framework (TTMF) and illustrates its use and benefits through a text-mining project on a real business case: evaluating and improving hotel service quality from a large volume of user-generated reviews. A total of 61,304 sentences extracted from customer reviews were allocated to SERVQUAL dimensions. The pragmatic validity of the model was tested with an OLS regression of customer satisfaction (star ratings) on the sentiment scores of each SERVQUAL dimension, which showed significant relationships. As a post-hoc analysis, a co-occurrence analysis was conducted to identify the root causes of positive and negative service quality perceptions and to derive action plans for improvement.
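The post-hoc co-occurrence step can be pictured with a minimal sketch. It is not the authors' implementation: the sentences, the stop-word list, and the function name `cooccurrence_counts` are all hypothetical, and it assumes co-occurrence is counted as two terms appearing in the same sentence.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy sentences, invented for illustration only
# (they are not taken from the study's review data).
negative_sentences = [
    "the room was dirty and the bathroom was dirty",
    "staff was rude at the front desk",
    "dirty room and rude staff",
]

# Hypothetical stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "was", "and", "at"}


def cooccurrence_counts(sentences, stop_words=STOP_WORDS):
    """Count how often each unordered pair of terms shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # A set ensures a term repeated within one sentence counts once;
        # sorting gives each pair a single canonical (alphabetical) order.
        terms = sorted(
            {word for word in sentence.lower().split() if word not in stop_words}
        )
        counts.update(combinations(terms, 2))
    return counts


pair_counts = cooccurrence_counts(negative_sentences)

# List the most frequent term pairs among the negative sentences.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

In a setting like the one described, pairs that recur within negatively scored sentences of one service dimension would be candidates for root causes. That interpretation step is a judgment call and is not automated here.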

Acknowledgement

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2016S1A5A2A03927883).
