Comparative Study of Dimension Reduction Methods for Highly Imbalanced Overlapping Churn Data

  • Lee, Sujee (Department of Industrial Engineering, Seoul National University) ;
  • Koo, Bonhyo (Department of Industrial Engineering, Seoul National University) ;
  • Jung, Kyu-Hwan (Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd.)
  • Received : 2014.11.16
  • Accepted : 2014.12.01
  • Published : 2014.12.30

Abstract

Retention of customers who are likely to churn is one of the most important issues in customer relationship management, so companies try to predict churners from their large-scale, high-dimensional data. This study focuses on handling such large data sets by reducing their dimensionality. Using six dimension reduction methods, namely principal component analysis (PCA), factor analysis (FA), locally linear embedding (LLE), local tangent space alignment (LTSA), locality preserving projections (LPP), and a deep auto-encoder, our experiments apply each method to the training data, build a classification model on the mapped data, and then measure performance by hit rate to compare the methods. In the results, PCA performs well despite its simplicity, and the deep auto-encoder gives the best overall performance. These results can be explained by the characteristics of the churn prediction data, which are highly correlated and overlap heavily across the classes. We also propose a simple out-of-sample extension for the nonlinear dimension reduction methods LLE and LTSA that exploits this characteristic of the data.
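As a concrete illustration of the evaluation pipeline the abstract describes, the following is a minimal sketch in Python. It assumes scikit-learn, an RBF-kernel SVM as the classifier (the paper cites LIBSVM), a synthetic imbalanced data set standing in for the real churn records, and a top-5% ranking cutoff for the hit rate; these specifics are illustrative assumptions, not details taken from the paper.

```python
# Sketch of the pipeline: fit dimension reduction on training data,
# classify in the reduced space, score hit rate on the top-ranked customers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic imbalanced two-class data in place of the churn records (assumed).
X, y = make_classification(n_samples=5000, n_features=50, n_informative=10,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0)

# 1) Fit the dimension reduction method on the training data only.
pca = PCA(n_components=10).fit(X_train)

# 2) Build the classifier on the mapped (low-dimensional) data.
clf = SVC(kernel="rbf", probability=True).fit(pca.transform(X_train), y_train)

# 3) Measure hit rate: here, the fraction of true churners among the
#    top-k customers ranked by predicted churn score (one common definition).
scores = clf.predict_proba(pca.transform(X_test))[:, 1]
k = int(0.05 * len(y_test))              # top 5% cutoff (assumed)
top_k = np.argsort(scores)[::-1][:k]
hit_rate = y_test[top_k].mean()
print(f"hit rate in top {k}: {hit_rate:.3f}")
```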
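The abstract does not spell out the proposed out-of-sample extension for LLE and LTSA, so the sketch below shows only the standard reconstruction-weight approach (cf. Bengio et al., reference 2): a new point is expressed as an affine combination of its nearest training neighbours in the input space, and the same weights are applied to those neighbours' embedded coordinates. The function name, neighbourhood size, and regularization constant are illustrative assumptions, not the authors' construction.

```python
# Standard nearest-neighbour out-of-sample extension for an LLE/LTSA-style
# embedding; a sketch only -- not the paper's specific method.
# X_train: original training data; Y_train: its precomputed embedding.
import numpy as np

def embed_new_point(x_new, X_train, Y_train, k=10, reg=1e-3):
    # Find the k nearest training points of the new sample.
    dists = np.linalg.norm(X_train - x_new, axis=1)
    nbrs = np.argsort(dists)[:k]

    # Solve for weights reconstructing x_new as an affine combination
    # of its neighbours (regularized local Gram system, as in LLE).
    Z = X_train[nbrs] - x_new            # neighbours shifted to the origin
    G = Z @ Z.T                          # local Gram matrix (k x k)
    G += reg * np.trace(G) * np.eye(k)   # regularization for stability
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                         # enforce weights summing to one

    # Apply the same weights to the neighbours' embedded coordinates.
    return w @ Y_train[nbrs]
```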

References

  1. Bengio, Y. (2007), Learning deep architectures for AI, Technical Report 1312, Université de Montréal, Canada.
  2. Bengio, Y., Paiement, J. F., Vincent, P., Delalleau, O., Le Roux, N., and Ouimet, M. (2004), Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering, Advances in Neural Information Processing Systems, 16, 177-184.
  3. Bhattacharya, C. B. (1998), When customers are members: customer retention in paid membership contexts, Journal of the Academy of Marketing Science, 26(1), 31-44. https://doi.org/10.1177/0092070398261004
  4. Chang, C. C. and Lin, C. J. (2011), LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2(3), 27.
  5. Ghahramani, Z. and Hinton, G. E. (1996), The EM algorithm for mixtures of factor analyzers, Technical Report CRG-TR-96-1, University of Toronto, Canada.
  6. He, X. and Niyogi, P. (2004), Locality preserving projections, Advances in Neural Information Processing Systems, 16, 153-160.
  7. Hinton, G. E. and Salakhutdinov, R. R. (2006), Reducing the dimensionality of data with neural networks, Science, 313(5786), 504-507. https://doi.org/10.1126/science.1127647
  8. Hotelling, H. (1933), Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 24(6), 417-441. https://doi.org/10.1037/h0071325
  9. Hsu, C. W., Chang, C. C., and Lin, C. J. (2003), A practical guide to support vector classification, Technical Report, Department of Computer Science, National Taiwan University, Taiwan.
  10. Kaiser, H. F. (1960), The application of electronic computers to factor analysis, Educational and Psychological Measurement, 20, 141-151. https://doi.org/10.1177/001316446002000116
  11. Kim, K. and Lee, J. (2012), Sequential manifold learning for efficient churn prediction, Expert Systems with Applications, 39(18), 13328-13337. https://doi.org/10.1016/j.eswa.2012.05.069
  12. Kim, N., Jung, K. H., Kim, Y. S., and Lee, J. (2012), Uniformly subsampled ensemble (USE) for churn management: theory and implementation, Expert Systems with Applications, 39(15), 11839-11845. https://doi.org/10.1016/j.eswa.2012.01.203
  13. Kim, Y. (2006), Toward a successful CRM: variable selection, sampling, and ensemble, Decision Support Systems, 41(2), 542-553. https://doi.org/10.1016/j.dss.2004.09.008
  14. Lee, H., Lee, Y., Cho, H., Im, K., and Kim, Y. S. (2011), Mining churning behaviors and developing retention strategies based on a partial least squares (PLS) model, Decision Support Systems, 52(1), 207-216. https://doi.org/10.1016/j.dss.2011.07.005
  15. Levina, E. and Bickel, P. J. (2004), Maximum likelihood estimation of intrinsic dimension, Advances in Neural Information Processing Systems, 17, 777-784.
  16. Pearson, K. (1901), On lines and planes of closest fit to systems of points in space, Philosophical Magazine Series 6, 2(11), 559-572. https://doi.org/10.1080/14786440109462720
  17. Reinartz, W., Krafft, M., and Hoyer, W. D. (2004), The customer relationship management process: its measurement and impact on performance, Journal of Marketing Research, 41(3), 293-305. https://doi.org/10.1509/jmkr.41.3.293.35991
  18. Rosset, S., Neumann, E., Eick, U., Vatnik, N., and Idan, I. (2001), Evaluation of prediction models for marketing campaigns, Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, 456-461.
  19. Rossi, P. E., McCulloch, R., and Allenby, G. (1996), The value of household information in target marketing, Marketing Science, 15(4), 321-340. https://doi.org/10.1287/mksc.15.4.321
  20. Roweis, S. T. and Saul, L. K. (2000), Nonlinear dimensionality reduction by locally linear embedding, Science, 290(5500), 2323-2326. https://doi.org/10.1126/science.290.5500.2323
  21. Spearman, C. (1904), 'General intelligence,' objectively determined and measured, American Journal of Psychology, 15(2), 201-292. https://doi.org/10.2307/1412107
  22. van der Maaten, L. J., Postma, E. O., and van den Herik, H. J. (2009), Dimensionality reduction: a comparative review, Journal of Machine Learning Research, 10, 66-71.
  23. Zhang, Z. and Zha, H. (2004), Principal manifolds and nonlinear dimensionality reduction via tangent space alignment, SIAM Journal on Scientific Computing, 26(1), 313-338.