• Title/Summary/Keyword: Churn Prediction

A MapReduce-based Artificial Neural Network Churn Prediction for Music Streaming Service

  • Chen, Min
    • International Journal of Computer Science & Network Security / v.22 no.1 / pp.55-60 / 2022
  • Churn prediction is a critical long-term problem for many businesses, such as music, games, and magazines. The churn probability can be used to study many aspects of a business, including proactive customer marketing, sales prediction, and churn-sensitive pricing models. Designing a machine learning model that predicts customer churn accurately is challenging because of the large volume of time-series data and the temporal issues in the data. In this paper, a parallel artificial neural network is proposed to create a highly accurate customer churn model on a large customer dataset. The proposed model achieves a significant improvement in the accuracy of churn prediction. The scalability and effectiveness of the proposed algorithm are also studied.

A Methodology of Customer Churn Prediction based on Two-Dimensional Loyalty Segmentation (이차원 고객충성도 세그먼트 기반의 고객이탈예측 방법론)

  • Kim, Hyung Su;Hong, Seung Woo
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.111-126 / 2020
  • Most industries have recently become aware of the importance of customer lifetime value as they are exposed to a competitive environment. As a result, preventing customer churn is becoming a more important business issue than acquiring new customers. This is because retaining existing customers is far more economical than acquiring new ones; the acquisition cost of a new customer is known to be five to six times higher than the cost of retaining an existing one. Companies that effectively prevent customer churn and improve customer retention rates are also known to increase their profitability and to improve their brand image through higher customer satisfaction. Customer churn prediction, which had been studied as a sub-area of CRM research, has recently become more important as a big data-based performance marketing theme owing to the development of business machine learning technology. Until now, research on customer churn prediction has been active in highly competitive sectors where churn management is urgent, such as the mobile telecommunication, financial, distribution, and game industries. These studies focused on improving the performance of the churn prediction model itself, for example by comparing the performance of various models, exploring features that are effective in forecasting churn, or developing new ensemble techniques, and they were limited in practical use because most of them treated the entire customer base as a single group when developing a predictive model. In short, the main purpose of existing research was to improve the performance of the predictive model itself, and there has been a relative lack of research on improving the overall customer churn prediction process. 
In fact, customers have different behavioral characteristics due to heterogeneous transaction patterns, and their churn rates differ accordingly, so it is unreasonable to treat the entire customer base as a single group. To carry out effective churn prediction in heterogeneous industries, it is therefore desirable to segment customers according to classification criteria such as loyalty and to operate an appropriate churn prediction model for each segment. Some studies do subdivide customers using clustering techniques and apply a churn prediction model to each customer group. Although this process can produce better predictions than a single prediction model for the entire customer population, there is still room for improvement, because clustering is a mechanical, exploratory grouping technique that calculates distances based on inputs and does not reflect an organization's strategic intent, such as loyalty. This study proposes a segment-based customer churn prediction process (CCP/2DL: Customer Churn Prediction based on Two-Dimensional Loyalty segmentation) built on two-dimensional customer loyalty, on the assumption that successful churn management is better achieved by improving the overall process than by improving the performance of the model itself. CCP/2DL is a series of churn prediction steps that segments customers along two loyalty dimensions, quantitative and qualitative, performs a secondary grouping of the customer segments according to churn patterns, and then independently applies heterogeneous churn prediction models to each churn pattern group. To assess the relative merit of the proposed process, its performance was compared with the two most commonly applied processes, the General churn prediction process and the Clustering-based churn prediction process. 
The General churn prediction process used in this study refers to the most commonly used approach, in which a single machine learning model is applied to the whole group of customers to be predicted. The Clustering-based churn prediction process first uses clustering techniques to segment customers and then implements a churn prediction model for each individual group. In cooperation with a global NGO, the proposed CCP/2DL showed better performance than the other churn prediction methodologies. This churn prediction process is not only effective in predicting churn but can also serve as a strategic basis for obtaining a variety of customer insights and carrying out other related performance marketing activities.
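
The first two CCP/2DL steps described above (two-dimensional loyalty segmentation, then secondary grouping by churn pattern) can be sketched roughly as follows. This is only an illustration under stated assumptions: the 2x2 grid, the 0.5 cut-offs, the group names, and the toy customer tuples are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def loyalty_segment(quant, qual, quant_cut=0.5, qual_cut=0.5):
    """Map a customer to one cell of a 2x2 loyalty grid (cut-offs are hypothetical)."""
    return ("high" if quant >= quant_cut else "low",
            "high" if qual >= qual_cut else "low")

def churn_rate_by_segment(customers):
    """customers: iterable of (quant_loyalty, qual_loyalty, churned) tuples."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [churned, total]
    for quant, qual, churned in customers:
        cell = counts[loyalty_segment(quant, qual)]
        cell[0] += int(churned)
        cell[1] += 1
    return {seg: churned / total for seg, (churned, total) in counts.items()}

def group_by_churn_pattern(rates, cut=0.5):
    """Secondary grouping: merge segments whose observed churn rates look alike."""
    groups = defaultdict(list)
    for seg, rate in rates.items():
        groups["churn-prone" if rate >= cut else "stable"].append(seg)
    return dict(groups)

# Toy data, made up for illustration: (quantitative loyalty, qualitative loyalty, churned)
customers = [
    (0.9, 0.8, False), (0.8, 0.9, False),
    (0.9, 0.2, False), (0.7, 0.1, True),
    (0.2, 0.9, True),  (0.1, 0.7, False),
    (0.1, 0.2, True),  (0.3, 0.1, True),
]
rates = churn_rate_by_segment(customers)
groups = group_by_churn_pattern(rates)
print(rates)
print(groups)
```

The third step, fitting a separate churn prediction model for each churn pattern group, is omitted here; any classifier could be trained per group on the customers whose segment falls in that group.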

Development of Prediction Model for Churn Agents -Comparing Prediction Accuracy Between Pattern Model and Matrix Model- (대리점 이탈예측모델 개발 - 동적모델(Pattern Model)과 정적모델(Matrix Model)의 예측적중률 비교 -)

  • An, Bong-Rak;Lee, Sae-Bom;Roh, In-Sung;Suh, Yung-Ho
    • Journal of Korean Society for Quality Management / v.42 no.2 / pp.221-234 / 2014
  • Purpose: The purpose of this study is to develop a model for predicting the group of agents likely to churn in the cosmetics industry. We develop two models, a pattern model and a matrix model, and compare their accuracy in predicting churn agents. Finally, we test empirically whether there is a statistically significant difference between the two models. Methods: We develop the two models using part of the RFM (Recency, Frequency, Monetary) method, a customer segmentation method from traditional CRM research. To determine which of the two models predicts churn agents more precisely, we used CRM data from cosmetics company A in China. Results: The pattern model and the matrix model were developed. We find a statistically significant difference between the two models in prediction accuracy. Conclusion: Both the pattern model and the matrix model predict churn agents. Although the pattern model uses the trend of the monetary amount over six months, the matrix model, which uses the amount of sales per month and the duration of employment, is better than the pattern model in prediction accuracy.

A Securities Company's Customer Churn Prediction Model and Causal Inference with SHAP Value (증권 금융 상품 거래 고객의 이탈 예측 및 원인 추론)

  • Na, Kwangtek;Lee, Jinyoung;Kim, Eunchan;Lee, Hyochan
    • The Journal of Bigdata / v.5 no.2 / pp.215-229 / 2020
  • Interest in machine learning is growing in all industries, but it is difficult to apply to real-world tasks because the models are hard to explain. This paper presents a case of developing a customer churn prediction model for a securities company and reports on an attempt to build an explainable machine learning model using the SHAP Value methodology. In this study, a total of six customer churn models are compared and analyzed, and the cause of customer churn is inferred through classification and data analysis of SHAP Values and the type of customer asset change. The results of this study could serve as a basis for comprehensive judgment: for example, marketing managers could use the SHAP Values of the churn prediction results to infer the cause of churn in their actual customer marketing and establish a targeted marketing strategy for each customer.

Comparative Study of Dimension Reduction Methods for Highly Imbalanced Overlapping Churn Data

  • Lee, Sujee;Koo, Bonhyo;Jung, Kyu-Hwan
    • Industrial Engineering and Management Systems / v.13 no.4 / pp.454-462 / 2014
  • Retaining customers who are likely to churn is one of the most important issues in customer relationship management, so companies try to predict churn customers using their large-scale, high-dimensional data. This study focuses on dealing with large data sets by reducing their dimensionality. We compare six dimension reduction methods: principal component analysis (PCA), factor analysis (FA), locally linear embedding (LLE), local tangent space alignment (LTSA), locality preserving projections (LPP), and a deep auto-encoder. Our experiments apply each method to the training data, build a classification model using the mapped data, and then measure performance by hit rate. In the results, PCA shows good performance despite its simplicity, and the deep auto-encoder gives the best overall performance. These results can be explained by the characteristics of churn prediction data, which are highly correlated and overlap across classes. We also propose a simple out-of-sample extension method for the nonlinear dimension reduction methods LLE and LTSA that exploits these characteristics of the data.

Using Image Visualization Based Malware Detection Techniques for Customer Churn Prediction in Online Games (악성코드의 이미지 시각화 탐지 기법을 적용한 온라인 게임상에서의 이탈 유저 탐지 모델)

  • Yim, Ha-bin;Kim, Huy-kang;Kim, Seung-joo
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.6 / pp.1431-1439 / 2017
  • In the security field, log analysis is important for detecting malware and abnormal behavior. Recently, image visualization techniques for malware detection have become a major part of security. These techniques can also be used in online games. Users may leave a game after a bad experience caused by game bots, automatic hunting programs, malicious code, etc. This churn can damage an online game's profit and the longevity of the service if game operators cannot detect such events in time. In this paper, we first propose a new churn prediction technique based on PNG image conversion to improve the efficiency of data analysis. By using this log compression technique, we can reduce the size of log files by a factor of 52,849 and increase the analysis speed without feature analysis. Second, we apply data mining techniques to predict user churn with a real dataset from Blade & Soul, developed by NCSoft. As a result, we can identify potential churners with a high accuracy of 97%.

Development of churn prediction model in a newspaper based on real case (사례를 기반으로 한 신문 산업에서의 고객 이탈 예측 모형 구축)

  • Yang, Seung-Jeong;Rhee, Jong Tae
    • Journal of the Korea Safety Management & Science / v.9 no.3 / pp.111-118 / 2007
  • CRM (Customer Relationship Management) means planning, executing, and reassessing marketing strategy based on customer characteristics identified by analyzing customer-related data. That is, CRM is a data-driven customer service strategy. In telecommunications and the newspaper industry, the application of CRM has been restricted because services are provided in return for a fixed payment over a fixed period of time. This paper develops a CRM model (churn prediction model) that can be applied to a newspaper. For model building, real data collected from one of the major newspaper companies in Korea were used. The paper also verifies the effectiveness of the results.

A Study on the Analysis of Comparison of Churn Prediction Models in Mobile Telecommunication Services (이동통신서비스 해지고객 예측모형의 비교 분석에 관한 연구)

  • Kim, Choong-Nyoung;Chang, Nam-Sik;Kim, Jun-Woo
    • Asia pacific journal of information systems / v.12 no.1 / pp.139-158 / 2002
  • As the telecommunication market matures in Korea, severe competition has already begun in the market. While service providers struggled for the last couple of years to acquire as many new customers as possible, they are now putting more effort into retaining current customers. Churn management based on analyzing customers' demographic and transactional data has become one of the key customer retention strategies that most companies pursue. However, customer data analysis in the industry has remained at a basic level, even though it has considerable potential as a tool for understanding customer behavior. This paper develops several churn prediction models using data mining techniques such as logistic regression, decision trees, and neural networks. For model building, real data collected from one of the major telecommunication companies in Korea were used. This paper explores various ways of comparing model performance, whereas previous research focused mainly on the hit ratio. The comparison criteria used in this study include gain ratio, the Kolmogorov-Smirnov statistic, the distribution of the predicted values, and explanation ability. This paper also suggests some guidance for model selection when applying data mining techniques.
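
One of the comparison criteria named above, the Kolmogorov-Smirnov statistic, can be illustrated with a minimal sketch. It assumes the usual two-sample definition (the largest gap between the empirical score distributions of churners and non-churners); the function name and the toy scores are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of churners (1) and non-churners (0)."""
    churn = sorted(s for s, y in zip(scores, labels) if y == 1)
    stay = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not churn or not stay:
        raise ValueError("need at least one record of each class")
    best = 0.0
    for t in sorted(set(scores)):
        # Fraction of each class with a score at or below the threshold t
        cdf_churn = bisect_right(churn, t) / len(churn)
        cdf_stay = bisect_right(stay, t) / len(stay)
        best = max(best, abs(cdf_churn - cdf_stay))
    return best

# Toy predicted churn scores and actual labels, made up for illustration
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
ks = ks_statistic(scores, labels)
print(round(ks, 2))
```

A value near 1 means the model's scores separate the two classes almost completely, while a value near 0 means the score distributions are nearly identical, which is why the statistic complements a single-threshold measure such as the hit ratio.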

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in the insurance customers' data, modify them if necessary, and go through a variable selection process before analysis. Then, we develop churn prediction models using logistic regression, neural network, and SVM analysis. We also compare these three data mining models in terms of misclassification rate.

A Hybrid SVM Classifier for Imbalanced Data Sets (불균형 데이터 집합의 분류를 위한 하이브리드 SVM 모델)

  • Lee, Jae Sik;Kwon, Jong Gu
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.125-140 / 2013
  • We call a data set in which the number of records belonging to one class far outnumbers the number of records belonging to the other class an 'imbalanced data set'. Most classification techniques perform poorly on imbalanced data sets. When we evaluate the performance of a classification technique, we need to measure not only 'accuracy' but also 'sensitivity' and 'specificity'. In a customer churn prediction problem, 'retention' records form the majority class, and 'churn' records form the minority class. Sensitivity measures the proportion of actual retentions that are correctly identified as such. Specificity measures the proportion of churns that are correctly identified as such. The poor performance of classification techniques on imbalanced data sets is due to the low value of specificity. Many previous studies on imbalanced data sets employed the 'oversampling' technique, in which members of the minority class are sampled more than those of the majority class in order to make a relatively balanced data set. When a classification model is constructed using this oversampled balanced data set, specificity can be improved but sensitivity decreases. In this research, we developed a hybrid model of support vector machine (SVM), artificial neural network (ANN), and decision tree that improves specificity while maintaining sensitivity. We named this hybrid model the 'hybrid SVM model.' The process of construction and prediction of our hybrid SVM model is as follows. By oversampling from the original imbalanced data set, a balanced data set is prepared. The SVM_I model and the ANN_I model are constructed using the imbalanced data set, and the SVM_B model is constructed using the balanced data set. The SVM_I model is superior in sensitivity and the SVM_B model is superior in specificity. For a record on which both the SVM_I model and the SVM_B model make the same prediction, that prediction becomes the final solution. 
If they make different predictions, the final solution is determined by the discrimination rules obtained from the ANN and a decision tree. For records on which the SVM_I model and the SVM_B model disagree, a decision tree model is constructed using the ANN_I output value as input and actual retention or churn as the target. We obtained the following two discrimination rules: 'IF ANN_I output value < 0.285, THEN Final Solution = Retention' and 'IF ANN_I output value ≥ 0.285, THEN Final Solution = Churn.' The threshold 0.285 is the value optimized for the data used in this research. The result we present is the structure or framework of our hybrid SVM model, not a specific threshold value such as 0.285; the threshold in the above discrimination rules can therefore be changed to suit the data. In order to evaluate the performance of our hybrid SVM model, we used the 'churn data set' from the UCI Machine Learning Repository, which consists of 85% retention customers and 15% churn customers. The accuracy of the hybrid SVM model is 91.08%, which is better than that of the SVM_I model or the SVM_B model. The points worth noticing are its sensitivity, 95.02%, and its specificity, 69.24%. The sensitivity of the SVM_I model is 94.65%, and the specificity of the SVM_B model is 67.00%. Therefore, the hybrid SVM model developed in this research improves on the specificity of the SVM_B model while maintaining the sensitivity of the SVM_I model.
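
The combination rule and the two evaluation measures described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three underlying models are replaced by precomputed toy predictions, the function names are hypothetical, and only the 0.285 threshold and the retention-as-positive convention come from the abstract.

```python
RETENTION, CHURN = "retention", "churn"

def hybrid_predict(svm_i_pred, svm_b_pred, ann_i_output, threshold=0.285):
    """Use the shared SVM prediction; on disagreement apply the ANN_I threshold rule."""
    if svm_i_pred == svm_b_pred:
        return svm_i_pred
    # Rule quoted in the abstract: below the threshold -> retention, otherwise churn
    return RETENTION if ann_i_output < threshold else CHURN

def sensitivity_specificity(actual, predicted):
    """Sensitivity: share of retentions found; specificity: share of churns found."""
    pairs = list(zip(actual, predicted))
    n_retention = sum(a == RETENTION for a, _ in pairs)
    n_churn = sum(a == CHURN for a, _ in pairs)
    if n_retention == 0 or n_churn == 0:
        raise ValueError("need at least one record of each class")
    sens = sum(a == RETENTION and p == RETENTION for a, p in pairs) / n_retention
    spec = sum(a == CHURN and p == CHURN for a, p in pairs) / n_churn
    return sens, spec

# Toy records, made up for illustration: (actual, SVM_I, SVM_B, ANN_I output)
records = [
    (RETENTION, RETENTION, RETENTION, 0.10),
    (RETENTION, RETENTION, CHURN, 0.20),
    (RETENTION, RETENTION, CHURN, 0.40),
    (CHURN, RETENTION, CHURN, 0.60),
    (CHURN, CHURN, CHURN, 0.90),
]
actual = [r[0] for r in records]
predicted = [hybrid_predict(r[1], r[2], r[3]) for r in records]
sensitivity, specificity = sensitivity_specificity(actual, predicted)
print(predicted, sensitivity, specificity)
```

As the abstract notes, the threshold is data-dependent, so it is exposed here as a parameter rather than hard-coded into the rule.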