An Opinion Document Clustering Technique for Product Characterization

  • Received : 2014.03.14
  • Accepted : 2014.04.24
  • Published : 2014.05.31

Abstract

Opinion mining is an application domain of text mining that extracts opinions from documents, and it is an area of active research. Most related work focuses on sentiment classification, which divides opinions about a given product category into positive and negative evaluations for each predefined feature. However, little attention has been paid to identifying the features that distinguish individual products. In this paper, we propose a technique that classifies opinion documents by product feature and then uses the result to select, within a product category, the features emphasized for each product. The proposed method uses document clustering together with a new algorithm for computing the similarity between documents. Experiments demonstrate the usefulness of the proposed method.
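
The pipeline summarized in the abstract (cluster opinion documents by feature, then pick the features emphasized for each product) can be illustrated with a minimal sketch. The abstract does not describe the paper's new similarity algorithm, so this sketch substitutes plain TF-IDF cosine similarity and a single-pass leader clustering. The function names, the threshold value, and the sample reviews are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic words longer than 2 chars."""
    return [w for w in text.lower().split() if w.isalpha() and len(w) > 2]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that every weight stays positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors (stand-in similarity measure)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def leader_clustering(vectors, threshold=0.2):
    """Assign each document to the first cluster whose leader is similar enough."""
    leaders, labels = [], []
    for v in vectors:
        for label, leader in enumerate(leaders):
            if cosine(v, leader) >= threshold:
                labels.append(label)
                break
        else:
            # No existing leader is close enough: this document starts a new cluster.
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels


def characteristic_clusters(products, labels):
    """For each product, keep clusters it is over-represented in versus the whole corpus."""
    total = Counter(labels)
    n = len(labels)
    per_product = defaultdict(Counter)
    for product, label in zip(products, labels):
        per_product[product][label] += 1
    result = {}
    for product, counts in per_product.items():
        size = sum(counts.values())
        result[product] = sorted(
            label for label, c in counts.items() if c / size > total[label] / n
        )
    return result


# Hypothetical toy reviews: (product, review text).
reviews = [
    ("A", "battery lasts long battery charge"),
    ("A", "battery charge lasts days"),
    ("B", "screen bright sharp screen"),
    ("B", "screen sharp colors"),
    ("B", "battery charge okay"),
]
products = [p for p, _ in reviews]
labels = leader_clustering(tfidf_vectors([text for _, text in reviews]))
print(labels)                                     # [0, 0, 1, 1, 0]
print(characteristic_clusters(products, labels))  # {'A': [0], 'B': [1]}
```

In this toy run, cluster 0 gathers the battery reviews and cluster 1 the screen reviews, so product A is characterized by the battery cluster and product B by the screen cluster. Swapping `cosine` for another similarity function is the natural place to experiment with alternative measures.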
