An Application of Support Vector Machines to Customer Loyalty Classification of Korean Retailing Company Using R Language

  • Nguyen, Phu-Thien (International Business Cooperative Course, Graduate School of Dongguk University)
  • Lee, Young-Chan (Dept. of Business Administration, Dongguk University)
  • Received : 2017.11.13
  • Accepted : 2017.12.18
  • Published : 2017.12.31

Abstract

Purpose: Customer loyalty is the most important factor in customer relationship management (CRM), especially in the retailing industry, where customers have many options for where to spend their money. Classifying loyal customers from customer data can help retailing companies build more efficient marketing strategies and gain competitive advantages. This study aims to construct classification models that distinguish loyal customers within a Korean retailing company, using data mining techniques in the R language.

Design/methodology/approach: To classify retail customers, we used support vector machines (SVMs) in combination with other machine learning (ML) classification algorithms, supported by recursive feature elimination (RFE). In particular, we first cleaned the dataset by removing outliers and imputing missing values. We then used an RFE framework to select the most significant predictors. Finally, we constructed models with the classification algorithms, tuned their parameters, and compared their performance.

Findings: The results reveal that ML classification techniques can work well with CRM data in the Korean retailing industry. Moreover, customer loyalty is affected not only by a single factor such as the net promoter score but also by purchase habits such as a preference for expensive goods or visits to multiple branches. We also show that, on the retailing customer dataset, the model built with the SVM algorithm performed better than the others. We expect that the models in this study can be used by other retailing companies to classify their customers, so that they can focus their services on this potential VIP group. We also hope that the results of applying these ML algorithms in R will be useful to other researchers in selecting appropriate ML algorithms.
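The imputation and RFE steps named under Design/methodology/approach can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not reproduce the authors' code. The column names (`nps`, `avg_spend`, `noise`), the toy rows, the labels, and the hyperparameters are all made up for illustration. Mean imputation and a linear SVM trained by subgradient descent are assumptions, since the abstract does not say which imputation method or kernel was used. Outlier removal, parameter tuning, and the model comparison are left out.

```python
from statistics import mean, pstdev

def impute_mean(rows):
    """Fill each missing value (None) with the mean of the observed values in its column."""
    cols = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in cols]
    return [[col_means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def standardize(rows):
    """Scale every column to zero mean and unit variance so weights are comparable."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - mus[j]) / sds[j] for j, v in enumerate(row)] for row in rows]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss (linear SVM)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            # Only samples inside the margin contribute to the hinge subgradient.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rfe(X, y, names, keep):
    """Refit the SVM and drop the feature with the smallest |weight| until `keep` remain."""
    active = list(range(len(names)))
    while len(active) > keep:
        Xa = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(Xa, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return [names[j] for j in active]

# Hypothetical toy data: the `noise` column is balanced within each class on purpose,
# so it carries no information about the label.
names = ["nps", "avg_spend", "noise"]
rows = [
    [9.0, 80.0, 1.0], [9.0, 80.0, 2.0],
    [None, 60.0, 1.0], [None, 60.0, 2.0],   # missing net promoter scores
    [2.0, 30.0, 1.0], [2.0, 30.0, 2.0],
    [3.0, 20.0, 1.0], [3.0, 20.0, 2.0],
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]       # +1 = loyal, -1 = not loyal (made-up labels)

imputed = impute_mean(rows)
selected = rfe(standardize(imputed), labels, names, keep=2)
print(selected)
```

Ranking by absolute weight is the usual criterion when RFE wraps a linear model. With a non-linear kernel the ranking would need a different importance measure, which this sketch does not attempt.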
