• Title/Summary/Keyword: analytic sets

Search Result 62

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-known research topic for those seeking better performance in predicting customers' responses to marketing promotions. A response model reduces marketing cost by identifying prospective customers in a very large customer database and predicting the purchase intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models with data mining techniques such as case-based reasoning (CBR), neural networks, and support vector machines. CBR is one of the major tools in business because it is known to be simple and robust when applied to response modeling. It remains an attractive technique for business data mining applications even though it has not shown high performance compared to other machine learning techniques. Thus many studies have tried to improve CBR for business data mining with enhanced algorithms or with the support of other techniques such as genetic algorithms, decision trees, and AHP (Analytic Hierarchy Process). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by a marketing department, and tried to optimize the number of neighbors k for k-nearest neighbor with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that an approach integrating CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customers' response to marketing promotion better than each individual data mining model such as logit, neural networks, or SVM. This paper presents an approach to predicting customers' response to marketing promotion with CBR. The proposed model was developed by applying different weights to each feature.
We fitted a logit model to a database containing the promotion and purchase data for bath soap, and then used its coefficients to assign different feature weights in CBR. We empirically compared the proposed weighted CBR model against neural networks and a pure CBR model and found that the weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building a data mining classifier on real data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared to the number of instances in the other classes. A classification model such as a response model has great difficulty learning patterns from such data because it tends to ignore the minority class while classifying the majority class correctly. Sampling is one of the most representative approaches to resolving the problem caused by imbalanced data distribution, and it can be categorized into undersampling and oversampling. However, CBR is not sensitive to data distribution because, unlike machine learning algorithms, it does not learn from data. In this study, we investigated the robustness of our proposed model while varying the ratio of response customers to nonresponse customers, because in the real world the customers who respond to a promotion are always a small fraction compared to those who do not. We simulated the proposed model 100 times with different ratios of response customers to nonresponse customers to validate its robustness under imbalanced data distribution. Finally, we found that our proposed CBR-based model outperformed the compared models on the imbalanced data sets.
Our study is expected to improve the performance of response models for promotion programs with CBR under the imbalanced data distributions found in the real world.
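
The weighting idea in the abstract above can be illustrated with a minimal sketch: a k-nearest-neighbor retrieval in which each feature's contribution to the distance is scaled by a weight derived from a logit coefficient. The abstract does not say how coefficients are mapped to weights, so the normalized absolute value used here is an assumption, and the coefficient values, cases, and function names are hypothetical.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def predict_response(query, cases, labels, weights, k=3):
    """Majority vote among the k stored cases closest to the query."""
    ranked = sorted(range(len(cases)),
                    key=lambda i: weighted_distance(query, cases[i], weights))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical logit coefficients, one per feature (not from the paper).
logit_coefs = [2.0, -0.1]

# Assumption: weight = |coefficient|, normalized so the weights sum to 1.
total = sum(abs(c) for c in logit_coefs)
weights = [abs(c) / total for c in logit_coefs]

# Hypothetical case base: two scaled features per customer,
# label 1 = responded to the promotion, 0 = did not respond.
cases = [[0.9, 0.1], [0.8, 0.9], [0.1, 0.2], [0.2, 0.8], [0.15, 0.5]]
labels = [1, 1, 0, 0, 0]

query = [0.7, 0.3]
prediction = predict_response(query, cases, labels, weights, k=3)
print(prediction)
```

Because the first feature carries most of the weight, neighbors are chosen mainly by similarity on that feature. Setting all weights equal recovers the unweighted ("pure") retrieval for comparison.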

A Reflectance Normalization Via BRDF Model for the Korean Vegetation using MODIS 250m Data (한반도 식생에 대한 MODIS 250m 자료의 BRDF 효과에 대한 반사도 정규화)

  • Yeom, Jong-Min;Han, Kyung-Soo;Kim, Young-Seup
    • Korean Journal of Remote Sensing / v.21 no.6 / pp.445-456 / 2005
  • Land surface parameters should be determined with sufficient accuracy because they play an important role in climate change near the ground. Because surface reflectance is strongly anisotropic, off-nadir viewing makes observations strongly dependent on the Sun-target-sensor geometry. This dependency contributes to the random noise produced by surface angular effects. The principal objective of this study is to provide a database of accurate surface reflectance, with the angular effects removed, from MODIS 250m reflective channel data over Korea. The MODIS (Moderate Resolution Imaging Spectroradiometer) sensor provides visible and near-infrared channel reflectance at 250m resolution on a daily basis. The successive analytic processing steps were first performed on a per-pixel basis to remove cloudy pixels. Geometric distortion was corrected by nearest-neighbor resampling using a 2nd-order polynomial obtained from the geolocation information of the MODIS data set. To correct the surface anisotropy effects, this paper applied a semiempirical kernel-driven Bidirectional Reflectance Distribution Function (BRDF) model. The algorithm inverts the kernel-driven model with respect to the angular components (viewing zenith angle, solar zenith angle, viewing azimuth angle, and solar azimuth angle) using the reflectance observed by satellite. First, we consider sets of model observations spanning a 31-day period to run the BRDF model. In the next step, nadir-view reflectance normalization is carried out by modifying the angular components separated by the BRDF model, for each spectral band and each pixel. Modeled reflectance values show good agreement with measured reflectance values, with an overall RMSE (Root Mean Square Error) of about 0.01 (maximum = 0.03). Finally, we provide a normalized surface reflectance database consisting of 36 images for 2001 over Korea.
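
As a rough illustration of the inversion and normalization steps in the abstract above, the sketch below assumes the common linear form of a kernel-driven model, R = f_iso + f_vol*K_vol + f_geo*K_geo, and fits the three coefficients by least squares for a single pixel and band. The abstract does not name the kernels, so the kernel values are treated as precomputed inputs. All numbers are synthetic, and the function names are hypothetical rather than taken from the paper.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_kernel_model(k_vol, k_geo, reflectance):
    """Least-squares fit of R = f_iso + f_vol*K_vol + f_geo*K_geo via the normal equations."""
    rows = [[1.0, kv, kg] for kv, kg in zip(k_vol, k_geo)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, reflectance)) for i in range(3)]
    return solve_linear(ata, atb)

def model_reflectance(coefs, kv, kg):
    """Evaluate the linear kernel model for one viewing/illumination geometry."""
    f_iso, f_vol, f_geo = coefs
    return f_iso + f_vol * kv + f_geo * kg

def rmse(pred, obs):
    """Root mean square error between modeled and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Synthetic kernel values for six observations of one pixel (hypothetical).
k_vol = [-0.02, 0.05, 0.12, 0.20, 0.01, 0.30]
k_geo = [-1.2, -1.0, -0.8, -1.5, -1.1, -0.6]

# Synthetic, noise-free "observed" reflectance generated from known coefficients.
true_coefs = (0.30, 0.10, 0.05)
observed = [model_reflectance(true_coefs, kv, kg) for kv, kg in zip(k_vol, k_geo)]

coefs = fit_kernel_model(k_vol, k_geo, observed)
modeled = [model_reflectance(coefs, kv, kg) for kv, kg in zip(k_vol, k_geo)]
fit_error = rmse(modeled, observed)

# Normalization: re-evaluate the fitted model with the kernel values of the
# nadir-view geometry (hypothetical values used here).
k_vol_nadir, k_geo_nadir = 0.0, -1.0
nadir_reflectance = model_reflectance(coefs, k_vol_nadir, k_geo_nadir)
print(coefs, fit_error, nadir_reflectance)
```

With real data, the observations would come from the compositing window (31 days in the abstract) and the fit would be repeated for each pixel and spectral band. The RMSE between modeled and measured reflectance then serves as the agreement check the abstract reports.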