Methodology for Identifying Issues of User Reviews from the Perspective of Evaluation Criteria: Focus on a Hotel Information Site

사용자 리뷰의 평가기준 별 이슈 식별 방법론: 호텔 리뷰 사이트를 중심으로

  • Byun, Sungho (Graduate School of Business IT, Kookmin University) ;
  • Lee, Donghoon (Graduate School of Business IT, Kookmin University) ;
  • Kim, Namgyu (School of Management Information Systems, Kookmin University)
  • 변성호 (국민대학교 비즈니스IT전문대학원) ;
  • 이동훈 (국민대학교 비즈니스IT전문대학원) ;
  • 김남규 (국민대학교 비즈니스IT전문대학원)
  • Received : 2016.08.17
  • Accepted : 2016.09.24
  • Published : 2016.09.30

Abstract

With the growth of Internet data and the rapid development of Internet technology, "big data" analysis has gained prominence as a major approach for evaluating and mining enormous volumes of data for various purposes. In recent years in particular, people have increasingly shared their own leisure experiences while also consulting the experiences of others, gathering information that can help them plan better leisure activities in the future. This phenomenon appears across many kinds of leisure activities, such as movies, travel, accommodation, and dining. Apart from blogs and social networking sites, many other websites provide a wealth of leisure-related information, presenting each product in various formats depending on purpose and perspective. Most of these websites provide the average ratings and detailed reviews of users who actually used the products or services, and these ratings and reviews can support potential customers' purchase decisions. However, existing leisure information websites provide ratings and reviews based on only a single level of evaluation criteria. Therefore, to identify the main issues for each evaluation criterion, as well as the characteristics of the specific elements that make up each criterion, users have to read a large number of reviews. In particular, because most users look for the characteristics of detailed elements within the one or more evaluation criteria that matter to them, they must spend a great deal of time and effort reading and interpreting reviews to obtain the information they want.
Although some websites break down the evaluation criteria and direct users to enter their reviews at different levels of criteria, the excessive number of input sections makes the whole process inconvenient. Further, problems arise if a user does not follow the instructions for the input sections or fills in the wrong section. Finally, treating such a breakdown as a realistic alternative is difficult, because identifying all the detailed criteria for each evaluation criterion is a challenging task. For example, a review of a hotel is typically written at only a single level, for components such as accessibility, rooms, services, or food. Such reviews may touch on frequently asked questions, such as the distance to the nearest subway station or the condition of the bathroom, but they still lack detailed information on them. In addition, when a breakdown of the evaluation criteria is provided along with multiple input sections, a user might fill in only the section for accessibility, or enter the wrong information, such as comments about rooms in the section for accessibility. As a result, the reliability of the segmented reviews is greatly reduced. In this study, we propose an approach to overcome two limitations of existing leisure activity information websites, namely, (1) the low reliability of the reviews entered for each evaluation criterion and (2) the difficulty of identifying the detailed contents that make up each evaluation criterion. In our proposed methodology, we first examine the review content and construct a lexicon for each evaluation criterion from the terms frequently used for that criterion. Next, the sentences in the review documents that contain terms in the constructed lexicon are decomposed into review units, which are then regrouped by evaluation criterion.
Finally, the issues in the review units regrouped under each evaluation criterion are derived, and the summarized results are provided together with the review units themselves. This approach therefore aims to save users time and effort, because they read only the information relevant to each evaluation criterion rather than the entire text of every review. Our proposed methodology is based on topic modeling, which is actively used in text analysis. Instead of treating a whole review as a single document, it decomposes each review into sentence-level review units, reorganizes these units by evaluation criterion, and uses them in the subsequent analysis. In this respect, this work differs substantially from existing topic modeling-based studies. In this paper, we collected 423 reviews from hotel information websites and decomposed them into 4,860 review units, which we then reorganized according to six evaluation criteria. By applying our methodology to these review units, we present the analysis results and demonstrate the utility of the proposed methodology.
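
The decomposition and regrouping steps described in the abstract can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the lexicon entries, the function names `split_into_units` and `group_units_by_criterion`, and the rule that a unit joins every criterion whose lexicon it matches are all assumptions made for the example. The abstract names accessibility, rooms, services, and food as example criteria but does not list the lexicon terms.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: evaluation criterion -> terms frequently used for it.
# The terms below are illustrative assumptions, not taken from the paper.
LEXICON = {
    "accessibility": {"subway", "station", "airport", "walk"},
    "rooms": {"room", "bed", "bathroom", "shower"},
    "services": {"staff", "check-in", "reception"},
    "food": {"breakfast", "restaurant", "buffet"},
}


def split_into_units(review):
    """Split one review into sentence-level review units."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def tokenize(unit):
    """Lowercase a unit and return its set of word tokens (hyphens kept)."""
    return set(re.findall(r"[a-z][a-z\-]*", unit.lower()))


def group_units_by_criterion(reviews, lexicon):
    """Assign each review unit to every criterion whose lexicon it matches.

    Assumption: a unit mentioning terms from several criteria is placed
    under each of them; units matching no lexicon term are dropped.
    """
    grouped = defaultdict(list)
    for review in reviews:
        for unit in split_into_units(review):
            tokens = tokenize(unit)
            for criterion, terms in lexicon.items():
                if tokens & terms:
                    grouped[criterion].append(unit)
    return dict(grouped)


reviews = [
    "The subway station is a five-minute walk away. The bathroom was spotless!",
    "Breakfast was great, but the staff at reception were slow.",
]
for criterion, units in group_units_by_criterion(reviews, LEXICON).items():
    print(criterion, units)
```

In this sketch the second review yields a single unit that is filed under both services and food, which shows why sentence-level units, rather than whole reviews, are the natural granularity for regrouping.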

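With recent advances in IT, many people share their leisure experiences and, in turn, draw on the leisure experiences of others to enjoy better leisure activities. This phenomenon appears across leisure activities in general, including movies, accommodation, dining, and travel, and at its center are numerous websites that summarize and provide leisure-related information. Most leisure information websites support the decisions of potential customers by providing not only the average rating of each product but also detailed reviews. However, because most existing websites provide ratings and reviews according to a single level of evaluation criteria, users face the inconvenience of having to read a considerable number of reviews themselves to understand the characteristics of the detailed elements that make up each criterion and the main issues for each criterion. That is, users spend a great deal of time and effort reading reviews one by one to learn about the criteria they consider important. For example, a website that provides ratings and reviews using only single-level criteria such as a hotel's accessibility, rooms, services, and food cannot give sufficient information to a user who wants to know, within accessibility, the distance to the subway station in particular, or, within rooms, the condition of the bathroom in particular. This study therefore proposes a way to overcome the limitations of existing leisure information websites, namely that reviews entered by evaluation criterion are hard to trust and that the detailed contents making up each criterion are hard to identify. The proposed methodology automatically classifies reviews that users enter without any separation into evaluation criteria according to their content, and summarizes the main issues for each criterion. It is based on topic modeling, which has recently been widely used in text analysis, and differs substantially from existing topic modeling-based studies in that it does not treat each review as a single document but splits reviews at the sentence level into individual review units and then regroups the review units by evaluation criterion for analysis. In this paper, we apply the proposed methodology to 423 review documents collected from an actual hotel information website, regroup a total of 4,860 review units under six evaluation criteria, and present the analysis results, thereby indirectly showing the usefulness of the proposed methodology.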
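
The issue-derivation step can be illustrated with a small topic model fitted to the review units of a single evaluation criterion. The abstract says only that the methodology is based on topic modeling; the sketch below assumes latent Dirichlet allocation fitted by collapsed Gibbs sampling, and the function names `fit_lda` and `top_terms`, the hyperparameters, and the sample tokens are hypothetical. Tokenization and stop-word removal are assumed to have happened upstream.

```python
import random


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, pure Python).

    docs: list of token lists, one per review unit of a single criterion.
    Returns (vocab, topic_word), where topic_word[k][v] is the number of
    tokens of vocabulary word v currently assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)

    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, z = word_id[w], assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return vocab, topic_word


def top_terms(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Hypothetical, already tokenized review units for the "rooms" criterion.
room_units = [
    ["bathroom", "clean", "shower", "hot"],
    ["bed", "comfortable", "pillow", "soft"],
    ["shower", "pressure", "bathroom", "small"],
    ["bed", "firm", "pillow", "comfortable"],
]
vocab, topic_word = fit_lda(room_units, num_topics=2)
for k, terms in enumerate(top_terms(vocab, topic_word)):
    print(f"issue {k}: {terms}")
```

Running the model separately on each criterion's review units, rather than once on whole reviews, mirrors the distinction the abstract draws from earlier topic modeling-based studies. The top terms of each topic then serve as a rough summary of the issues within that criterion.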
