• Title/Summary/Keyword: Big data analysis techniques (빅데이터분석기법)


Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-known research problem whose goal is to better predict customers' responses to marketing promotions. A response model reduces marketing cost by identifying prospective customers in a very large customer database and predicting the purchase intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models with data mining techniques such as case-based reasoning (CBR), neural networks, and support vector machines. CBR is one of the major tools in business because it is known to be simple and robust when applied to response modeling. It remains an attractive technique for business data mining applications even though it has not shown high performance compared to other machine learning techniques. Thus many studies have tried to improve CBR for business data mining with enhanced algorithms or with the support of other techniques such as genetic algorithms, decision trees, and AHP (Analytic Hierarchy Process). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by a marketing department, and optimized the number of neighbors k for k-nearest neighbor with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that the integrated approach combining CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customers' responses to marketing promotions better than the individual data mining models (logit, neural networks, and SVM). This paper presents an approach to predicting customers' responses to marketing promotions with CBR. The proposed model applies a different weight to each feature.
We fitted a logit model to a database containing promotion and purchasing data for bath soap, and then used its coefficients as the feature weights for CBR. We empirically compared the proposed weighted CBR model with neural networks and a pure CBR model and found that the weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building data mining models that classify real-world data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared to the number of instances in the other classes. A classification model such as a response model has trouble learning patterns from such data because it tends to ignore the minority class while classifying the majority class correctly. Sampling is one of the most representative approaches to the problem caused by imbalanced data distribution, and it can be categorized into under-sampling and over-sampling. However, CBR is not sensitive to data distribution because, unlike machine learning algorithms, it does not learn from data. In this study, we investigated the robustness of the proposed model while changing the ratio of response customers to nonresponse customers, because in the real world the customers who respond to a promotion are always few compared to those who do not. We ran the proposed model 100 times to validate its robustness under different ratios of response customers to nonresponse customers in the imbalanced data distribution. Finally, we found that the proposed CBR-based model outperformed the compared models on the imbalanced data sets.
Our study is expected to improve the performance of response models for promotion programs with CBR under imbalanced data distribution in the real world.
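
The feature-weighting idea in this abstract can be illustrated with a minimal sketch. It assumes that the absolute values of the logit coefficients are normalized into weights and that retrieval is a weighted Euclidean k-nearest-neighbor vote; the abstract does not state either detail, and the coefficients, cases, and function names below are hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature contributes according to its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def predict_response(query, cases, labels, weights, k=3):
    """Retrieve the k most similar stored cases and return the majority label (1 = responded)."""
    ranked = sorted(range(len(cases)),
                    key=lambda i: weighted_distance(query, cases[i], weights))
    votes = [labels[i] for i in ranked[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0

# Hypothetical logit coefficients for three features (assumption, not from the paper).
logit_coefs = [1.8, -0.2, 0.0]
total = sum(abs(c) for c in logit_coefs)
weights = [abs(c) / total for c in logit_coefs]  # normalized so the weights sum to 1

# Hypothetical case base: feature vectors and whether the customer responded.
cases = [
    [0.90, 0.1, 0.5],
    [0.80, 0.9, 0.1],
    [0.10, 0.2, 0.9],
    [0.20, 0.8, 0.4],
    [0.15, 0.5, 0.5],
]
labels = [1, 1, 0, 0, 0]

prediction = predict_response([0.85, 0.5, 0.9], cases, labels, weights, k=3)
print(prediction)
```

Because the third feature has zero weight in this toy setup, it has no influence on which cases are retrieved; an unweighted (pure) CBR model would treat all three features equally.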

A Robust Object Detection and Tracking Method using RGB-D Model (RGB-D 모델을 이용한 강건한 객체 탐지 및 추적 방법)

  • Park, Seohee;Chun, Junchul
    • Journal of Internet Computing and Services / v.18 no.4 / pp.61-67 / 2017
  • Recently, CCTV has been combined with fields such as big data, artificial intelligence, and image analysis to detect various abnormal behaviors and to detect and analyze the overall situation of objects such as people. Image analysis research for this intelligent video surveillance function is progressing actively. However, CCTV images that use only 2D information generally have limitations, such as object misrecognition caused by the lack of topological information. This problem can be solved by adding to the image the depth information of the object, obtained with two cameras. In this paper, we perform background modeling using the Mixture of Gaussians technique and detect moving objects by segmenting the foreground from the modeled background. To perform depth-based segmentation on top of the RGB-based segmentation results, stereo-based depth maps are generated using two cameras. Next, the RGB-based segmented region is set as the domain for extracting depth information, and depth-based segmentation is performed within that domain. To detect the center point of the robustly segmented object and track its direction, the movement of the object is tracked with the CAMShift technique, which is the most basic object tracking method. The experiments demonstrate the efficiency of the proposed object detection and tracking method using the RGB-D model.

A Performance Test of Mobile Cloud Service for Bayesian Image Fusion (베이지안 영상융합을 적용한 모바일 클라우드 성능실험)

  • Kang, Sanggoo;Lee, Kiwon
    • Korean Journal of Remote Sensing / v.30 no.4 / pp.445-454 / 2014
  • Recently, trending technologies such as cloud, big data, and mobile, which are important marketable keywords or paradigms in Information and Communication Technology (ICT), have been widely used and interrelated across various types of platforms and web-based services. In particular, the combination of cloud and mobile is recognized as a profitable business model that holds the benefits of both. Despite this, there are few application cases of this model dealing with geo-based data sets or imagery. Among the many considerations for geo-based cloud applications on mobile, this study focused on a performance test of a mobile cloud running a Bayesian image fusion algorithm on satellite images. Two cloud platforms, Amazon and OpenStack, were built for the performance test, which was based on CPU time stamps. Because no scheme for performance testing of mobile clouds has been established yet, the experimental conditions in this study were limited to checking time stamps. The results show that performance on the two platforms is at almost the same level. This implies that open source mobile cloud services based on OpenStack are adequate for further applications dealing with geo-based data sets.

An Iris Detection Algorithm for Disease Prediction based Iridology (홍채학기반이 질병예측을 위한 홍채인식 알고리즘)

  • Cho, Young-bok;Woo, Sung-Hee;Lee, Sang-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.1 / pp.107-114 / 2017
  • Iris diagnosis is an alternative medicine practice that diagnoses a patient's disease from differences in iris pattern, color, and other characteristics. This paper proposes a disease prediction algorithm that analyzes changes in iris regions using a differential image of iris images. The method can be used for a patient's health examination according to iris change. Because most previous studies only find a sign pattern in an iris image, they are not sufficient for an iris diagnosis system. We developed an iris diagnosis system based on iris image processing; it presents extraction algorithms for 8 major iris signs, with manual correction to improve the accuracy of analysis. As a result, the PSNR of the edge-detected image is about 132, and pattern matching area recognition showed practical potential for automatic diagnosis that infers the condition of the human body from the iris at about 91%.
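
Assuming the image quality figure reported above refers to the peak signal-to-noise ratio (PSNR), the standard definition can be sketched as follows. The 8-bit peak value of 255 and the tiny sample images are assumptions for illustration; the abstract does not say how its value was computed.

```python
import math

def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    flat_ref = [p for row in reference for p in row]
    flat_proc = [p for row in processed for p in row]
    if len(flat_ref) != len(flat_proc):
        raise ValueError("images must have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_proc)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 images given as nested lists of pixel intensities.
original = [[0, 255], [255, 0]]
edge_detected = [[0, 250], [255, 5]]
print(round(psnr(original, edge_detected), 2))
```

A higher PSNR means the processed image deviates less from the reference.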

The Difference Analysis between Maturity Stages of Venture Firms by Classification Techniques of Big Data (빅데이터 분류 기법에 따른 벤처 기업의 성장 단계별 차이 분석)

  • Jung, Byoungho
    • Journal of Korea Society of Digital Industry and Information Management / v.15 no.4 / pp.197-212 / 2019
  • The purpose of this study is to identify the maturity stages of venture firms through classification analysis, which is widely used as a big data technique. Venture companies should develop a competitive advantage in the market, and the maturity stage of a company can be classified into five stages. I analyze the difference in the growth stage of venture firms between survey responses and statistical classification methods. The firm growth level was divided into five stages, covering the periods from start-up to decline. Popular big data classification methods include k-means cluster analysis, hierarchical cluster analysis, artificial neural networks, and decision tree analysis. The variables used were asset increase, capital increase, sales increase, operating profit increase, R&D investment increase, operation period, and number of retirements. The results show that the big data analysis techniques produced groups with large differences in sample size. In particular, the decision tree and neural network methods yielded three groups rather than five. The group sizes differed across all of the classification methods. Furthermore, the results may differ depending on variable selection and sample size. Each classified group also showed a number of competitive differences. The research implication is that analysts need to interpret statistics through management theory in order to interpret big data classification results correctly. In addition, the choice of classification analysis should take into account not only management theory but also practical experience. Finally, the growth of venture firms needs to be examined by time-series analysis and closely monitored for individual firms. Future research will need to include variables that are significant for a company's maturity stages.

ABox Realization Reasoning in Distributed In-Memory System (분산 메모리 환경에서의 ABox 실체화 추론)

  • Lee, Wan-Gon;Park, Young-Tack
    • Journal of KIISE / v.42 no.7 / pp.852-859 / 2015
  • As the amount of knowledge information increases significantly, much progress has been made in studies on how to reason effectively over large-scale ontologies at the level of RDFS or OWL. These reasoning methods are divided into TBox classification and ABox realization. TBox classification mainly deals with integrity and dependencies in the schema, whereas ABox realization mainly handles a variety of issues in instances. Therefore, ABox realization is very important in practical applications. In this paper, we propose a realization method that analyzes the constraints of a specified class, so that the reasoning system automatically infers the classes to which instances belong. Unlike conventional methods that rely on a distributed file system based on an object-oriented language, we propose a large-scale ontology reasoning method using Spark, which is a functional programming-based in-memory system. To verify the effectiveness of the proposed method, we used instances created from the Wine ontology by W3C (120 to 600 million triples). In our largest experiment, the proposed system processed 600 million triples and generated 951 million triples in 51 minutes (696 K triples/sec).
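
To make the notion of ABox realization concrete, here is a minimal single-machine sketch in plain Python rather than Spark. It assumes a simplified "has this property value" style of class constraint plus a single-parent subclass hierarchy; the function name, the data layout, and the small wine-like example are illustrative assumptions and not the paper's rule set.

```python
def realize(triples, subclass_of, restrictions):
    """Return {instance: set of classes the instance is inferred to belong to}.

    triples:      iterable of (subject, predicate, object)
    subclass_of:  {class: parent class} (single parent, for simplicity)
    restrictions: {class: (property, required value)}, a hasValue-style constraint
    """
    types = {}
    # Start from explicitly asserted types.
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    # An instance that satisfies a class constraint is realized as that class.
    for cls, (prop, value) in restrictions.items():
        for s, p, o in triples:
            if p == prop and o == value:
                types.setdefault(s, set()).add(cls)
    # Propagate membership up the subclass hierarchy.
    for classes in types.values():
        frontier = list(classes)
        while frontier:
            parent = subclass_of.get(frontier.pop())
            if parent is not None and parent not in classes:
                classes.add(parent)
                frontier.append(parent)
    return types

# Hypothetical data loosely modeled on a wine ontology.
triples = {
    ("w1", "hasColor", "Red"),
    ("w2", "rdf:type", "Wine"),
}
subclass_of = {"RedWine": "Wine", "Wine": "PotableLiquid"}
restrictions = {"RedWine": ("hasColor", "Red")}

inferred = realize(triples, subclass_of, restrictions)
print(sorted(inferred["w1"]))
```

In a distributed setting, each of the three passes would correspond to a join or map over partitioned triples; the sketch only shows the inference logic itself.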

Analysis of Agenda-setting Changes in Alpine Agricultural of Uljin-gun Using Text-Mining - Focusing on the Keywords of Mass-media, Blog·Cafe - (텍스트마이닝 기법을 활용한 울진군 금강송 산지농업 의제설정 변화 - 매스미디어와 블로그·카페 키워드를 중심으로 -)

  • Do, Jee-Yoon;Jeong, Myeong-Cheol
    • Journal of the Korean Institute of Rural Architecture / v.24 no.3 / pp.47-57 / 2022
  • This study sought to understand the status and perception of Uljin Geumgangsong by examining mass media issues and user perception using big data, and to provide basic data for building monitoring based on user perception by examining how agenda setting is established from a time-series perspective. The results of collecting and analyzing text data that reveal mass media and visitor awareness are as follows. First, both mass media and visitor keywords were related to the importance of the value and meaning of Uljin Geumgangsong. Second, the connection network was centered on Geumgang pine agriculture, while the differences in perception between mass media and visitors stemmed from their different objects of interest. Third, in the connection relationship structure, the connection strength was strong because mass media content overlapped heavily. Fourth, the centrality analysis showed that both mass media and visitor keywords positively recognized the area as a space created and maintained through institutional support, and objective perception could be identified by finding hidden keywords. Fifth, the time series analysis made it possible to follow the flow of issue keywords that appeared in each period and showed that, unlike in the past, the area was recognized as a place for tourism and travel. Finally, examining whether the agenda setting is consistent showed that mass media has an influence, so more diverse information and publicity that make use of it appear to be needed.

A Study on the Defect Detection of Fabrics using Deep Learning (딥러닝을 이용한 직물의 결함 검출에 관한 연구)

  • Eun Su Nam;Yoon Sung Choi;Choong Kwon Lee
    • Smart Media Journal / v.11 no.11 / pp.92-98 / 2022
  • Identifying defects in textiles is a key procedure for quality control. This study attempted to create a model that detects defects by analyzing images of fabrics. The models used in the study were the deep learning-based VGGNet and ResNet, and the defect detection performance of the two models was compared and evaluated. The accuracy of the VGGNet and ResNet models was 0.859 and 0.893, respectively, so ResNet was the more accurate of the two. In addition, the region of attention of the model was derived with the Grad-CAM algorithm, an eXplainable Artificial Intelligence (XAI) technique, to locate the region that the deep learning model recognized as a defect in the fabric image. As a result, it was confirmed that the region recognized by the deep learning model as a defect was actually defective, even to the naked eye. The results of this study are expected to reduce the time and cost incurred in the fabric production process by applying deep learning-based artificial intelligence to defect detection in the textile industry.

Analysis of the Effects of E-commerce User Ratings and Review Helpfulness on Performance Improvement of Product Recommender System (E-커머스 사용자의 평점과 리뷰 유용성이 상품 추천 시스템의 성능 향상에 미치는 영향 분석)

  • FAN, LIU;Lee, Byunghyun;Choi, Ilyoung;Jeong, Jaeho;Kim, Jaekyeong
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.311-328 / 2022
  • Because of the spread of smartphones due to the development of information and communication technology, online shopping mall services can be used on computers and mobile devices. As a result, the number of users of online shopping mall services is increasing rapidly, and the types of products traded are also growing. Therefore, to maximize profits, companies need to provide information that may interest users. To this end, a recommendation system presents necessary information or products to the user based on the user's past behavioral data or purchase records. Representative overseas companies that currently provide recommendation services include Netflix, Amazon, and YouTube. These companies support users' purchase decisions by recommending products using the ratings that users give to items, purchase records, and clickstream data. In addition, users refer to the ratings left by other users before buying a product. Most users tend to rate only the products they are satisfied with, and the higher the rating, the higher the purchase intention. Recently, e-commerce sites have also let users vote on whether product reviews are helpful. Users then make purchase decisions by referring to the reviews and ratings of products judged to be helpful. Therefore, this study identifies the correlation between product ratings and review helpfulness information. The review helpfulness information is then reflected in the recommendation system to check the recommendation performance. In addition, we compare the results of reflecting all the ratings in the traditional collaborative filtering technique with the recommendation performance obtained by reflecting only the ratings of 4 and 5. For this purpose, electronic product data collected from Amazon was used in this study, and the experimental results confirmed a correlation between ratings and review usefulness information.
A comparison of the recommendation performance obtained by reflecting all the ratings with that obtained by reflecting only the ratings of 4 and 5 showed that reflecting only the ratings of 4 and 5 gave higher recommendation performance. In addition, reflecting review usefulness information in the recommendation system confirmed that the more valuable the review, the higher the recommendation performance. These experimental results are therefore expected to improve the performance of personalized recommendation services in the future and to provide implications for e-commerce sites.
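
The comparison described in this abstract, collaborative filtering over all ratings versus only ratings of 4 and 5, can be sketched as a filtering step placed in front of a simple user-based recommender. The cosine similarity, the scoring rule, and the sample data are assumptions for illustration and are not the authors' experimental setup.

```python
import math

def keep_high_ratings(ratings, threshold=4):
    """Keep only ratings at or above the threshold (here, the 4 and 5 ratings)."""
    return {user: {item: r for item, r in items.items() if r >= threshold}
            for user, items in ratings.items()}

def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=2):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, items in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], items)
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, r in items.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * r
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "u1": {"phone": 5, "case": 4, "cable": 1},
    "u2": {"phone": 5, "case": 5, "charger": 4, "cable": 2},
    "u3": {"cable": 5, "adapter": 2, "phone": 1},
}

all_ratings_result = recommend("u1", ratings)
high_only_result = recommend("u1", keep_high_ratings(ratings))
print(all_ratings_result, high_only_result)
```

With all ratings, the low-rated overlap between u1 and u3 still produces a recommendation from u3; after filtering, only users who share highly rated items contribute.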

Methodology for Assessing an Integrated Mobility of the Passenger Passing through Intermodal Transit Center (복합환승역사 통행자 기반 통합 모빌리티 평가 기법 개발)

  • You, So-young;Kim, Kyongtae;Jeong, Eunbi;Lee, Jun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.5 / pp.12-28 / 2017
  • The core of the transportation service known as Mobility 4.0 is the flexibility of the entire mobility and its implementation. To achieve it, the most essential element is to build a platform that links supply and demand simultaneously. In other words, a comprehensive analytical framework needs to be set up with a data repository that can be periodically updated. Under such circumstances, the entire trip chain, including pedestrian movements, needs to be thoroughly investigated and constructed from the viewpoint of the intermodal transit station. Few such studies, however, have been attempted. In this study, a comprehensive analytical framework for integrated mobility at intermodal transit stations was proposed, consisting of three modules: 1) Data Repository Extracting from Smart Card DB, 2) Framework of Analyzing Integrated Mobility, and 3) Interpretation of the Integrated Mobility with GIS information and other factors. A case study of seven railway stations (Sadang, Sindorim, Samseong, Gwanghwamun, Gangnam, Jamsil, Seoul Nat'l Univ. of Education) was conducted. The case study stations were clustered into three groups on statistical grounds, which is expected to help in understanding the effect of a variety of factors and to support comprehensive data-driven analyses of the entire trip stages.