• Title/Summary/Keyword: model tree technique

A Machine Learning-based Customer Classification Model for Effective Online Free Sample Promotions (온라인 무료 샘플 판촉의 효과적 활용을 위한 기계학습 기반 고객분류예측 모형)

  • Won, Ha-Ram;Kim, Moo-Jeon;Ahn, Hyunchul
    • The Journal of Information Systems / v.27 no.3 / pp.63-80 / 2018
  • Purpose: The purpose of this study is to build a machine learning-based customer classification model that enhances the customer expansion effect of free sample promotions. Specifically, the proposed model identifies potential target customers who are expected to purchase the products included in the promotion after receiving the free samples. Design/methodology/approach: This study builds a customer classification model for determining which customers are suitable recipients of free samples, using various machine learning techniques such as logistic regression, multiple discriminant analysis, case-based reasoning, decision tree, artificial neural network, and support vector machine. To validate the usefulness of the proposed model, we apply it to a real-world free sample-based target marketing case of a major Korean cosmetic retail company. Findings: Experimental results show that the machine learning-based customer classification models achieve satisfactory accuracy, ranging from 70% to 75%. In particular, the support vector machine is found to be the most effective machine learning technique for the free sample-based target marketing model. Our study sheds light on customer relationship management strategies using free sample promotions.

On Constructing NURBS Surface Model from Scattered and Unorganized 3-D Range Data (정렬되지 않은 3차원 거리 데이터로부터의 NURBS 곡면 모델 생성 기법)

  • Park, In-Kyu;Yun, Il-Dong;Lee, Sang-Uk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.37 no.3 / pp.17-30 / 2000
  • In this paper, we propose an efficient algorithm to produce a 3-D surface model from a set of range data, based on the NURBS (Non-Uniform Rational B-Splines) surface fitting technique. The range data is assumed to consist of initially unorganized and scattered 3-D points whose connectivity is unknown. The proposed algorithm consists of three steps: initial model approximation, hierarchical representation, and construction of the NURBS patch network. The initial model is approximated by a polyhedral and triangular model using the K-means clustering technique. Then, the initial model is represented by a hierarchically decomposed tree structure. Based on this, a $G^1$ continuous NURBS patch network is constructed efficiently. Both the computational complexity and the modeling error are greatly reduced by means of hierarchical decomposition and precise approximation of the NURBS control mesh. Experimental results show that the initial model and the NURBS patch network are constructed automatically, while the modeling error is observed to be negligible.
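
The K-means step mentioned in the abstract can be illustrated with a minimal sketch on scattered 3-D points. This is not the authors' implementation: the function names, the deterministic initialization from the first k points, and the sample coordinates are all assumptions made for illustration.

```python
import math

def closest(point, centroids):
    """Index of the centroid nearest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd-style K-means on tuples of coordinates.

    Assumption: centroids start at the first k points so the run is deterministic.
    """
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: group every point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[closest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[c]
            for c, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [closest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical range data: two well-separated groups of 3-D points.
points = [
    (0.0, 0.0, 0.0), (10.0, 10.0, 10.0),
    (0.0, 1.0, 0.0), (10.0, 11.0, 10.0),
    (1.0, 0.0, 0.0), (11.0, 10.0, 10.0),
]
centroids, labels = kmeans(points, k=2)
```

Each resulting cluster could then seed one face of a coarse polyhedral approximation; how the paper maps clusters to faces is not stated in the abstract.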

Developing the administrative model using the data mining technique for injury in National Health Insurance (데이터마이닝 기법을 활용한 국민건강보험 상해상병 관리모형 개발)

  • Park, Il-Su;Han, Jun-Tae;Sohn, Hae-Sook;Kang, Suk-Bok
    • Journal of the Korean Data and Information Science Society / v.22 no.3 / pp.467-476 / 2011
  • We developed a hybrid model that couples a predictive model with a business rule model for the administration of injury, using medical data from the National Health Insurance in Korea. We performed decision tree analysis using data mining methodology with SAS Enterprise Miner 4.1. We also investigated several business rules for benefits (expenses paid by the insurer) and claims of injury in the National Health Insurance Corporation. The results show that the proposed hybrid model is efficient and produces plausible results.

Predicting Reports of Theft in Businesses via Machine Learning

  • JungIn, Seo;JeongHyeon, Chang
    • International Journal of Advanced Culture Technology / v.10 no.4 / pp.499-510 / 2022
  • This study examines the reporting factors of crime against business in Korea and proposes a corresponding predictive model using machine learning. While many previous studies focused on the individual factors of theft victims, there is a lack of evidence on the reporting factors of crime against a business that serves the public good as opposed to those that protect private property. Therefore, we proposed a crime prevention model for the willingness factor of theft reporting in businesses. This study used data collected through the 2015 Commercial Crime Damage Survey conducted by the Korea Institute for Criminal Policy. It analyzed data from 834 businesses that had experienced theft during a 2016 crime investigation. The data showed a class imbalance problem. To solve this problem, we jointly applied the Synthetic Minority Over-sampling Technique (SMOTE) and the Tomek link technique to the training data. Two prediction models were implemented. One was a statistical model using logistic regression and elastic net. The other involved a support vector machine model, tree-based machine learning models (e.g., random forest, extreme gradient boosting), and a stacking model. As a result, the features of theft price, invasion, and remedy, which are known to have significant effects on reporting theft offences, can be predicted as determinants of such offences in companies. Finally, we verified and compared the proposed predictive models using several popular metrics. Based on our evaluation of the importance of the features used in each model, we suggest a more accurate criterion for predicting var.
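
The two resampling ideas named in the abstract can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the study's pipeline: the function names, the neighbour count `k`, and the toy coordinates are hypothetical, and `smote` needs at least two minority samples.

```python
import math
import random

def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        # k nearest minority neighbours of sample i (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(minority[i], minority[j]),
        )[:k]
        j = rng.choice(neighbours)
        gap = rng.random()  # position along the segment between the two samples
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic

def nearest(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )

def tomek_links(points, labels):
    """Pairs (i, j) of opposite-class samples that are each other's nearest neighbour."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        if i < j and labels[i] != labels[j] and nearest(j, points) == i:
            links.append((i, j))
    return links

# Toy data: label 1 is the minority class.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote(minority, n_new=5, k=2, rng=random.Random(42))

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0)]
labels = [0, 1, 1, 1]
links = tomek_links(points, labels)
```

In a combined scheme, samples taking part in a Tomek link are typically removed after oversampling to clean the class boundary; whether the study removed one or both members of each link is not stated in the abstract.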

Enhancing prediction accuracy of concrete compressive strength using stacking ensemble machine learning

  • Yunpeng Zhao;Dimitrios Goulias;Setare Saremi
    • Computers and Concrete / v.32 no.3 / pp.233-246 / 2023
  • Accurate prediction of concrete compressive strength can minimize the need for extensive, time-consuming, and costly mixture optimization testing and analysis. This study attempts to enhance the prediction accuracy of compressive strength using stacking ensemble machine learning (ML) with feature engineering techniques. Seven alternative ML models of increasing complexity were implemented and compared: linear regression, SVM, decision tree, multilayer perceptron, random forest, XGBoost, and AdaBoost. To further improve the prediction accuracy, an ML pipeline was proposed in which the feature engineering technique was implemented and a two-layer stacked model was developed. The k-fold cross-validation approach was employed to optimize model parameters and train the stacked model. The stacked model showed superior performance in predicting concrete compressive strength, with a coefficient of determination ($R^2$) of 0.985. Feature (i.e., variable) importance was determined to demonstrate how useful the synthetic features are in prediction and to provide better interpretability of the data and the model. The methodology in this study promotes a more thorough assessment of alternative ML algorithms rather than a focus on any single ML model type for concrete compressive strength prediction.
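
For reference, the coefficient of determination reported above is conventionally computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that standard formula; the function name and the sample values are hypothetical and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total variation
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1 - ss_res / ss_tot

# Hypothetical measured vs. predicted compressive strengths (MPa).
measured = [30.0, 40.0, 50.0]
predicted = [31.0, 40.0, 49.0]
score = r_squared(measured, predicted)
```

A model that always predicts the mean of the observations scores 0, and a perfect model scores 1.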

Personalized Service Based on Context Awareness through User Emotional Perception in Mobile Environment (모바일 환경에서의 상황인식 기반 사용자 감성인지를 통한 개인화 서비스)

  • Kwon, Il-Kyoung;Lee, Sang-Yong
    • Journal of Digital Convergence / v.10 no.2 / pp.287-292 / 2012
  • This paper studies the location-based sensing data preprocessing and emotion data preprocessing techniques required to support personalized services through emotion perception, covering how a user's emotion data is built and preprocessed in the V-A emotion model. For this purpose, a granular context tree and string matching-based emotion pattern matching techniques are used. In addition, a context-aware personalized recommendation service technique using probabilistic reasoning is studied for personalized services based on context awareness.

Business Process Repository for Exception Handling in BPM (예외업무 관리를 위한 비즈니스 프로세스 저장소의 활용)

  • Choi Deok-Won;Sin Jin-Gyu;Jin Jung-Hyeon
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.265-270 / 2006
  • In an organization where major business operations are driven by a business process management system (BPMS), routine tasks are processed according to predefined business processes. However, most business operations are subject to some sort of exception, and exceptional situations require either an update of the existing business process model or the definition of a new business process model to handle them. This paper proposes a system architecture that deploys a business process repository as the medium for storage and retrieval of the various business process models developed for exception handling. Well-defined situation variables and decision variables play a key role in the efficient storage and retrieval of these models. The data mining technique C5.0 was used to build the optimum path for the process repository search tree.

Evaluations of AI-based malicious PowerShell detection with feature optimizations

  • Song, Jihyeon;Kim, Jungtae;Choi, Sunoh;Kim, Jonghyun;Kim, Ikkyun
    • ETRI Journal / v.43 no.3 / pp.549-560 / 2021
  • Cyberattacks are often difficult to identify with traditional signature-based detection, because attackers continually find ways to bypass the detection methods. Therefore, researchers have introduced artificial intelligence (AI) technology for cybersecurity analysis to detect malicious PowerShell scripts. In this paper, we propose a feature optimization technique for AI-based approaches to enhance the accuracy of malicious PowerShell script detection. We statically analyze the PowerShell script and preprocess it with a method based on tokens and the abstract syntax tree (AST) for feature selection. Here, tokens and the AST represent the vocabulary and structure of the PowerShell script, respectively. Performance evaluations with optimized features yield detection rates of 98% in both machine learning (ML) and deep learning (DL) experiments. Among them, the ML model using 3-grams of five selected tokens and the DL model using AST 3-grams deliver the best performance.
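
The 3-gram features mentioned above can be pictured as counts of contiguous triples over a sequence of token types or AST node types. The sketch below is a generic n-gram counter, not the paper's feature extractor; the function names and the token-type labels are hypothetical.

```python
from collections import Counter

def token_ngrams(tokens, n=3):
    """Return the contiguous n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(tokens, n=3):
    """Count each n-gram; the counts can serve as a bag-of-n-grams feature vector."""
    return Counter(token_ngrams(tokens, n))

# Hypothetical token-type sequence for a short script (labels are illustrative only).
tokens = ["Command", "Parameter", "String", "Command", "Parameter", "String"]
features = ngram_features(tokens, n=3)
```

The same functions apply unchanged to a flattened sequence of AST node types, which is one plausible way to obtain the AST 3-grams the abstract refers to.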

A COMPARATIVE ANALYSIS OF THE ACCURACY OF IMPLANT IMPRESSION TECHNIQUES BY USING STRAIN GAUGE (Strain gauge를 사용한 임플랜트 인상법의 정확도 비교)

  • Han, Eu-Taek;Kim, Yung-Soo;Kim, Chang-Whe
    • The Journal of Korean Academy of Prosthodontics / v.33 no.3 / pp.539-549 / 1995
  • The purpose of this study was to determine the accuracy of 3 implant impression methods by using strain gauges. The models used for this study were partially edentulous mandibular acrylic resin casts. Model A, with two abutment analogs in the #46 and #47 extraction sites, represented two implants parallel to the adjacent natural tooth. Model B represented an anterior implant parallel to the adjacent natural tooth and a posterior implant exhibiting a 15-degree lingual inclination. A master framework was fabricated on the master model, and 3 strain gauges were attached to the master framework to determine the passivity of fit of the framework to sample casts made by the three impression techniques. The master framework was attached to each sample cast with gold screws, which were tightened with a torque driver to ensure a consistent torque application of 10 Ncm. The Universal Digital Measuring System UCAM-5BT was used for strain measurement. The impression techniques studied were: 1. unsplinted tapered impression coping, polyvinyl siloxane, stock tray; 2. unsplinted squared impression coping, polyether, custom tray; 3. squared impression coping splinted with Duralay resin, polyether, custom tray. From the analysis of the data, the following conclusions were obtained. 1. There were no statistically significant differences between the mean strains recorded from the sample casts made with the three impression techniques, except that the Y-axis strain values of model A (parallel group) differed significantly between Techniques 1 and 3 (P<0.05). 2. There was no statistically significant difference between model A (parallel group) and model B (15-degree divergent group).

Estimation of a Nationwide Statistics of Hernia Operation Applying Data Mining Technique to the National Health Insurance Database (데이터마이닝 기법을 이용한 건강보험공단의 수술 통계량 근사치 추정 -허니아 수술을 중심으로-)

  • Kang, Sung-Hong;Seo, Seok-Kyung;Yang, Yeong-Ja;Lee, Ae-Kyung;Bae, Jong-Myon
    • Journal of Preventive Medicine and Public Health / v.39 no.5 / pp.433-437 / 2006
  • Objectives: The aim of this study is to develop a methodology for estimating a nationwide statistic for hernia operations using the claim database of the Korea Health Insurance Cooperation (KHIC). Methods: According to the insurance claim procedures, the claim database was divided into the electronic data interchange database (EDI_DB) and the sheet database (Paper_DB). Although the EDI_DB has operation and management codes showing the facts and kinds of operations, the Paper_DB does not. Using the hernia-matched management code in the EDI_DB, the cases of hernia surgery were extracted. To draw the potential cases from the Paper_DB, which lacks the code, a predictive model was developed using the data mining technique called SEMMA. The claim sheets of the cases whose predicted probability of an operation exceeded the threshold, which was decided by the ROC curve, were reviewed to obtain the positive predictive value as an index of the usefulness of the predictive model. Results: In the 2004 claim databases, 14,386 cases submitted through the EDI system had hernia-related management codes. For fitting the models with the data mining technique, logistic regression was chosen rather than the neural network method or the decision tree method. From the Paper_DB, 1,019 cases were extracted as potential cases. Direct review of the sheets of the extracted cases showed that the positive predictive value was 95.3%. Conclusions: The results suggest that applying the data mining technique to the KHIC claim database to estimate nationwide surgical statistics would be useful in terms of execution and cost-effectiveness.
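
The thresholding and positive predictive value steps in the Methods can be sketched as follows. The abstract does not say how the threshold was read off the ROC curve, so maximizing Youden's J (sensitivity minus false positive rate) is an assumption made here, as are the function names and the toy scores.

```python
def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fp, fn, tn

def youden_threshold(scores, labels):
    """Pick the candidate threshold that maximizes TPR - FPR on the ROC curve.

    Assumption: Youden's J is one common criterion, not necessarily the study's.
    """
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, fn, tn = confusion_counts(scores, labels, t)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        if tpr - fpr > best_j:
            best_threshold, best_j = t, tpr - fpr
    return best_threshold

def positive_predictive_value(scores, labels, threshold):
    """Share of predicted positives that are true positives: TP / (TP + FP)."""
    tp, fp, _, _ = confusion_counts(scores, labels, threshold)
    return tp / (tp + fp) if tp + fp else float("nan")

# Toy predicted probabilities and true labels (1 = operation performed).
scores = [0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.9, 0.95]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
threshold = youden_threshold(scores, labels)
ppv = positive_predictive_value(scores, labels, threshold)
```

In the study, the positive predictive value was obtained by direct review of the claim sheets flagged by the model, which corresponds to computing TP / (TP + FP) over the extracted cases.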