Customer Classification and Market Basket Analysis Using K-Means Clustering and Association Rules: Evidence from Distribution Big Data of Korean Retailing Company

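Korean title (translated): Customer Classification and Market Basket Analysis Using Cluster Analysis and Association Rules: Focusing on Retail Distribution Big Data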

  • Received : 2018.08.29
  • Accepted : 2018.11.25
  • Published : 2018.12.31


With the arrival of the big data era, customer data and data mining have come to dominate Customer Relationship Management (CRM). Customer data, together with information technology (IT), is now the basis for a successful CRM strategy. However, some companies cannot extract valuable information from their large volumes of customer data and therefore fail to form an appropriate business strategy. Without a suitable strategy, a company may lose its competitive advantage or even go bankrupt. The purpose of this study is to propose CRM strategies by segmenting customers into VIPs and non-VIPs and by identifying purchase patterns in the VIPs' transaction data from an online shopping mall in Korea, using two data mining techniques: K-means clustering and association rules. The results indicate that 227 of 1,866 customers were classified as VIPs. In the VIPs' 51,080 transactions, home products and women's wear are frequently associated with food, which suggests that purchases of home products or women's wear are linked to purchases of food. Marketing managers of the shopping mall should therefore take these shopping patterns into account when building a CRM strategy.
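The two techniques named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the customer features (order count, total spend), the toy baskets, the function names `kmeans` and `rule_metrics`, and the choice of k = 2 are all hypothetical, and the numbers are invented for the example.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Initial centroids are simply the first k
    points (a simplification; real analyses usually use random restarts)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.
    Assumes both items occur in at least one basket."""
    n = len(baskets)
    n_a = sum(1 for b in baskets if antecedent in b)
    n_c = sum(1 for b in baskets if consequent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = n_both / n
    confidence = n_both / n_a
    lift = confidence / (n_c / n)
    return support, confidence, lift


# Hypothetical customers: (number of orders, total spend). Features are
# left unscaled here, so spend dominates the distance.
customers = [(2, 50), (3, 60), (40, 900), (1, 30), (38, 1000), (2, 40)]
labels, centroids = kmeans(customers, k=2)

# Treat the cluster whose centroid has the highest spend as the VIP segment.
vip_cluster = max(range(2), key=lambda c: centroids[c][1])
vip_ids = [i for i, lab in enumerate(labels) if lab == vip_cluster]

# Hypothetical VIP baskets, one set of product categories per transaction.
vip_baskets = [
    {"home product", "food"},
    {"women's wear", "food"},
    {"home product", "food"},
    {"home product"},
    {"sports"},
]
support, confidence, lift = rule_metrics(vip_baskets, "home product", "food")
```

A lift above 1 means the two categories appear together more often than they would if purchased independently, which is the kind of evidence behind a statement such as "home products are frequently associated with food".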

