• Title/Summary/Keyword: CHAID Technique

A Feature Analysis of Industrial Accidents Using CHAID Algorithm (CHAID 알고리즘을 이용한 산업재해 특성분석)

  • Leem Young-Moon;Hwang Young-Seob
    • Journal of the Korea Safety Management & Science / v.7 no.5 / pp.59-67 / 2005
  • The main objective of statistical analysis of industrial accidents is to identify the dangerous factors in a given industrial field so that accidents can be prevented, or their number reduced, by educating those who work in the field about safety tools. So far, however, there has been no technique for the quantitative evaluation of danger. Almost all previous research on industrial accidents has relied only on frequency analysis, such as analysis of the constituent ratio of accidents. As an application of data mining, this paper analyzes the efficiency of the CHAID algorithm in classifying types of industrial accident data and thereby identifies potential weak points in accident risk grouping.

An introductory study on the urban functions using CHAID technique (CHAID 技法에 의한 都市機能의 試論的 硏究)

  • ;Yang, Soon-Jeong
    • Journal of the Korean Geographical Society / v.29 no.3 / pp.360-368 / 1994
  • To date, a number of quantitative analytical methods have been employed to clarify regional characteristics in geography. As one application of such quantitative analysis, this paper attempts to identify urban functions, and consequently urban characteristics, statistically by adopting the newly introduced CHAID, a kind of discriminant analysis technique. The data were processed in two phases. First, urban functions were classified by designating twenty cities, each with a population of 250,000 or more, as the predictor variable and four major urban functions (administration, marketing, finance and production) as the response variable. Second, the preeminent functions of individual regions were discriminated and classified by treating the remaining traffic, education, medicare, culture and transportation functions as the predictor variable and the following five regions as the response variable: the Metropolitan Seoul area, Pusan region, Taegu region, Kwangju region and Chungcheong region. According to the first analysis, marketing and administration emerged as meaningful functions in Seoul and Taegu respectively. For the finance function, only Pusan and Pucheon can be discriminated. Seoul, Pusan and Seongnam are dominant in the production function. In the second analysis, the Metropolitan Seoul area shows, among other functions, strong traffic and finance functions. In the Pusan region, administration, education and finance are recorded as the leading functions, and the Taegu region stands out in education, medicare and transportation. In the Kwangju region, administration, production and education are discriminated from the other functions. The Chungcheong region shows a similar pattern, with only the traffic function replacing the production function of the Kwangju region.
Based on this analysis, the CHAID technique, which can process large amounts of categorical data and facilitates interpretation by presenting its outcome as a dendrogram, is an effective and meaningful means of classifying and discriminating geographical regions and their characteristics.

A Study on the Effective Database Marketing using Data Mining Technique(CHAID) (데이터마이닝 기법(CHAID)을 이용한 효과적인 데이터베이스 마케팅에 관한 연구)

  • 김신곤
    • The Journal of Information Technology and Database / v.6 no.1 / pp.89-101 / 1999
  • An increasing number of companies recognize that understanding their customers and markets is indispensable for survival and business success. Companies are rapidly increasing their investment in customer databases, which are the basis for database marketing activities. Database marketing is closely related to data mining, the non-trivial extraction of implicit, previously unknown and potentially useful knowledge or patterns from large data sets. Data mining applied to database marketing can contribute greatly to a company's competitiveness and sustainable competitive advantage. This paper develops a classification model to select the most responsive customers from customer databases for a telemarketing system and evaluates the model's performance using the lift measure. The model employs CHAID, a well-known decision tree algorithm. The paper also presents an effective database marketing strategy by applying the data mining technique to a credit card company's telemarketing system.
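
The lift measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition (response rate among the top-scored customers divided by the overall response rate); the function name `lift_at` and the toy scores and responses are hypothetical and are not taken from the paper.

```python
def lift_at(scores, responded, fraction):
    """Lift for the top `fraction` of customers when ranked by model score.

    scores    -- model scores, higher meaning more likely to respond
    responded -- 1 if the customer actually responded, else 0
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    base_rate = sum(responded) / len(responded)
    if base_rate == 0:
        raise ValueError("lift is undefined when nobody responded")
    # Rank customers from highest to lowest score.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    # Response rate among the k best-scored customers.
    top_rate = sum(r for _, r in ranked[:k]) / k
    return top_rate / base_rate


# Hypothetical toy data: ten customers, three of whom responded.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at(scores, responded, 0.2))
```

A lift above 1 for a given fraction means that calling only that top-scored share of customers reaches responders at a higher rate than calling customers at random.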

On the Determination of Outpatient's Revisit using Data Mining (데이터 마이닝을 활용한 병원 재방문도 영향요인 분석 : 외래환자의 만족도를 중심으로)

  • 이견직
    • Health Policy and Management / v.13 no.3 / pp.21-34 / 2003
  • Patient revisits to a previously used hospital are a key factor in a health care organization's competitive advantage and survival. This article examines the relationship between customer satisfaction and revisits using three methods: Chi-squared Automatic Interaction Detection (CHAID) for segmenting the outpatient group, and logistic regression and neural networks for modeling outpatient revisits. The main findings indicate that the important factors in outpatient revisits are the physician's kindness, the nurse's skill, overall level of satisfaction, hospital reputation, recommendation, level of diagnoses and the outpatient's age. Among these, the physician's kindness is the most important factor in the decision to revisit. Hospital decision makers should select a strategy that accounts for the varying level of revisit and the size of the outpatient group under the hospital's given constraints on time, budget and manpower. Finally, this study shows that neural networks, a non-parametric technique, appear to predict revisits more accurately than logistic regression, a parametric estimation technique.

A Study on Development of A Web-Based Forecasting System of Industrial Accidents (웹 기반의 산업재해 예측시스템 개발에 관한 연구)

  • Leem, Young-Moon;Hwang, Young-Seob;Choi, Yo-Han
    • Proceedings of the Safety Management and Science Conference / 2007.11a / pp.269-274 / 2007
  • The ultimate goal of this research is to develop a web-based forecasting system for industrial accidents. As an initial step, this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5 and QUEST. In addition, it presents the logical process for developing a forecasting system. Decision tree algorithms, a typical data mining technique, are used to predict results from objective, quantified data. The sample was drawn from 10,536 records on manufacturing industries in Korea over three years (2002-2004).

SUPPORT Applications for Classification Trees

  • Lee, Sang-Bock;Park, Sun-Young
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.565-574 / 2004
  • Classification tree algorithms such as CART by Breiman et al. (1984) recursively partition the data space in several steps, with the aim of making the distribution of the class variable as pure as possible within each partition. In the SUPPORT (smoothed and unsmoothed piecewise-polynomial regression trees) method of Chaudhuri et al. (1994), a weighted averaging technique is used to combine piecewise polynomial fits into a smooth one. We focus on applying SUPPORT to a binary class variable. A logistic model is used in the calculations, and the results show good classification rates compared with other methods such as CART, QUEST and CHAID.

Analyzing vocational outcomes of people with hearing impairments : A data mining approach (청각장애인의 취업결정요인 분석 연구 -데이터마이닝 기법(Exhaustive CHAID)의 적용)

  • Shin, Hyun-Uk
    • Journal of Digital Convergence / v.13 no.11 / pp.449-459 / 2015
  • The purpose of this study was to examine demographic, human capital and service factors affecting the employment outcomes of people with hearing impairments. Data on a total of 422 individuals with hearing impairments (aged 20 to 65) were collected from the Panel Survey of Employment for the Disabled from the Korea Employment Agency for the Disabled. The dependent variable is employment outcome. The predictor variables include a set of personal history, human capital and rehabilitation service variables. The chi-squared automatic interaction detector (CHAID) analysis revealed that national basic livelihood security status played a determining role in predicting the employment of people with hearing impairments. It was also found that three factors (national basic livelihood security status, need for help with activities of daily living, and licenses and employment services) created a greater synergy effect when they complemented one another.
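
The core step of CHAID, choosing the predictor most strongly associated with the outcome by a chi-squared test, can be sketched as follows. This is a simplified illustration, not the study's procedure: it assumes binary (0/1) predictors and a binary outcome, so each test has one degree of freedom, and it omits the category merging and Bonferroni adjustment that CHAID normally applies. The variable names and toy data are hypothetical.

```python
import math
from collections import Counter


def chi2_2x2(xs, ys):
    """Pearson chi-squared statistic and p-value for two binary (0/1) variables."""
    n = len(xs)
    counts = Counter(zip(xs, ys))
    row = [counts[(i, 0)] + counts[(i, 1)] for i in (0, 1)]
    col = [counts[(0, j)] + counts[(1, j)] for j in (0, 1)]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            if expected == 0:
                # A constant variable carries no information about the other.
                return 0.0, 1.0
            stat += (counts[(i, j)] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def best_split(predictors, outcome):
    """Return the predictor with the smallest p-value, plus all test results."""
    results = {name: chi2_2x2(values, outcome) for name, values in predictors.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, results


# Hypothetical toy data: eight people, outcome 1 = employed.
employed = [1, 1, 1, 1, 0, 0, 0, 0]
predictors = {
    "receives_benefit": [0, 0, 0, 0, 1, 1, 1, 1],
    "has_license": [1, 0, 1, 0, 1, 0, 1, 0],
}
best, results = best_split(predictors, employed)
print(best, results[best])
```

A full CHAID tree would repeat this selection within each resulting subgroup until no predictor reaches the chosen significance level.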

Development of an Expert System for Prevention of Industrial Accidents in Manufacturing Industries (제조업에서의 산업재해 예방을 위한 전문가 시스템 개발)

  • Leem Young-Moon;Choi Yo-Han
    • Journal of the Korea Safety Management & Science / v.8 no.1 / pp.53-64 / 2006
  • Much research and analysis has focused on industrial accidents in order to predict and reduce them. As a similar endeavor, this paper aims to develop an expert system for the prevention of industrial accidents. Although various previous studies have sought to prevent industrial accidents, they provide only managerial and educational policies based on frequency analysis and comparative analysis of data from past accidents. As an initial step, this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5 and QUEST. Decision tree algorithms, a typical data mining technique, are used to predict results from objective, quantified data. Enterprise Miner of SAS and Answer Tree of SPSS will be used to evaluate the validity of the results of the four algorithms. The sample was drawn from 10,536 records on manufacturing industries in Korea over three years (2002-2004). The initial sample includes a range of businesses, including the construction and manufacturing industries, which are typically vulnerable to industrial accidents.

Evaluation on Performance for Classification of Students Leaving Their Majors Using Data Mining Technique (데이터마이닝 기법을 이용한 전공이탈자 분류를 위한 성능평가)

  • Leem, Young-Moon;Ryu, Chang-Hyun
    • Proceedings of the Safety Management and Science Conference / 2006.11a / pp.293-297 / 2006
  • Recently, most universities have been suffering from students leaving their majors, and many are trying to find a proper countermeasure to reduce the major separation rate. As a similar endeavor, this paper uses a decision tree algorithm, a data mining technique that groups or predicts by dividing a group of interest into several sub-groups. This technique can analyze the characteristics of students leaving their majors. The dataset consists of 5,115 features selected from a total of 13,346 data items collected from a university in Kangwon-Do over seven years (2000.3.1 to 2006.6.30). The main objective of this study is to evaluate the performance of the CHAID, CART and C4.5 algorithms in classifying students leaving their majors, using ROC, lift and gains charts. The study also reports accuracy, sensitivity and specificity from a classification table. According to the analysis, CART showed the best performance in classifying students leaving their majors.
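
The accuracy, sensitivity and specificity reported from a classification table can be computed as in the following sketch. It assumes the standard definitions for a binary classifier, with 1 marking a student who left the major; the function name and toy labels are hypothetical and do not reproduce the paper's results.

```python
def classification_metrics(actual, predicted):
    """Accuracy, sensitivity and specificity from binary labels (1 = positive).

    Assumes both classes occur in `actual`, so neither denominator is zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual leavers detected
        "specificity": tn / (tn + fp),  # share of actual stayers detected
    }


# Hypothetical toy data: ten students, 1 = left the major.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Comparing these three values across algorithms, alongside ROC, lift and gains charts, is one way to rank classifiers such as CHAID, CART and C4.5 on the same test data.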

Predicting Model of Students Leaving Their Majors Using Data Mining Technique (데이터마이닝 기법을 이용한 전공이탈자 예측모형)

  • Leem, Young-Moon;Ryu, Chang-Hyun
    • Journal of the Korea Safety Management & Science / v.8 no.5 / pp.17-25 / 2006
  • Nowadays most colleges face a serious problem because many students leave their majors, and many universities are trying to find a proper countermeasure to reduce the major separation rate. As a similar endeavor, the objective of this paper is to find a model for predicting students leaving their majors. The sample was drawn from a university in Kangwon-Do over seven years (2000.3.1 to 2006.6.30). The partitioned data were split 50% : 50% between training and testing samples for validation. The study also reports accuracy, sensitivity and specificity for three algorithms: CHAID, CART and C4.5. In addition, ROC and gains charts were used for the classification of students leaving their majors. The results were informative because they identify the most important factors that can affect students leaving their majors, such as the semester in which a course is taken, grades in cultural subjects, scholarship, grades in major subjects, and total completion of courses.