• Title/Summary/Keyword: CHAID Analysis

Selecting variables for evidence-diagnosis of paralysis disease using CHAID algorithm

  • Shin, Yan-Kyu
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2001.10a / pp.76-78 / 2001
  • Variable selection in oriental medical research is considered. Decision tree algorithms such as CHAID, CART, C4.5 and QUEST have been successfully applied to medical research. Paralysis is a highly dangerous, often deadly disease that is accompanied by severe physical handicap. In this paper, we explore the use of the CHAID algorithm for selecting variables for evidence-diagnosis of paralysis. Empirical results comparing our proposed method to the method using Wilks' $\lambda$ are given.
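
To illustrate the chi-square test that CHAID-style variable selection rests on, the sketch below scores each candidate predictor against a categorical outcome with the Pearson chi-square statistic and ranks the predictors. It is not the paper's procedure: the data, the variable names (`symptom_a`, `symptom_b`) and the helper `chi_square_statistic` are hypothetical, and a full CHAID implementation would also merge similar categories and compare Bonferroni-adjusted p-values rather than raw statistics.

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic for the contingency table of two categorical lists."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: a binary diagnosis and two candidate predictors.
diagnosis = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
predictors = {
    "symptom_a": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "symptom_b": ["high", "low", "high", "low", "high", "low", "high", "low"],
}

# Score every predictor and rank from strongest to weakest association.
# Raw statistics are only comparable here because both tables are 2x2.
scores = {name: chi_square_statistic(values, diagnosis)
          for name, values in predictors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data `symptom_a` separates the two diagnosis groups completely while `symptom_b` is independent of the diagnosis, so `symptom_a` ranks first.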

A Feature Analysis of Industrial Accidents Using CHAID Algorithm (CHAID 알고리즘을 이용한 산업재해 특성분석)

  • Leem Young-Moon;Hwang Young-Seob
    • Journal of the Korea Safety Management & Science / v.7 no.5 / pp.59-67 / 2005
  • The main objective of statistical analysis of industrial accidents is to identify the hazard factors in each industrial field, so that accidents can be prevented or their number reduced by educating workers in those fields about safety tools. So far, however, there has been no technique for quantitative evaluation of danger. Almost all previous research on industrial accidents has relied only on frequency analysis, such as analysis of the constituent ratio of accidents. As an application of data mining, this paper analyzes the efficiency of the CHAID algorithm in classifying types of industrial accident data and thereby identifies potential weak points in accident risk grouping.

Development of Selection Model of Interchange Influence Area in Seoul Belt Expressway Using Chi-square Automatic Interaction Detection (CHAID) (CHAID분석을 이용한 나들목 주변 지가의 공간분포 영향모형 개발 - 서울외곽순환고속도로를 중심으로 -)

  • Kim, Tae Ho;Park, Je Jin;Kim, Young Il;Rho, Jeong Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.6D / pp.711-717 / 2009
  • This study develops a model for analyzing the relationship between a major node (an expressway interchange) and the land price formation of apartments along the Seoul Belt Expressway using CHAID analysis. The results show, first, that the regions along the Seoul Belt Expressway differ (outer side: Gyeonggi-do; inner side: Seoul) and that the relationship between land price and the traffic node, although generally expected to be linear, is not linear; second, that CHAID analysis identifies two different spatial distributions, divided at 2.6 km, on the outer side, but three different spatial distributions, divided at 1.4 km and 3.8 km, on the inner side. In other words, traffic access does not necessarily guarantee a high housing price, since the graphs show that land price follows a composite spatial distribution. This implies that the residential environment (highway noise and regional discontinuity) and traffic accessibility interact to generate this phenomenon. The highway IC land price model will therefore be useful for calculating land prices in the New Towns that are continually being built along the highway.

An introductory study on the urban functions using CHAID technique (CHAID 技法에 의한 都市機能의 試論的 硏究)

  • ;Yang, Soon-Jeong
    • Journal of the Korean Geographical Society / v.29 no.3 / pp.360-368 / 1994
  • To this day, a number of quantitative analytical methods have been employed to clarify regional characteristics in the discipline of geography. As one application of those quantitative analyses, this paper attempts to clarify urban functions, and consequently urban characteristics, statistically by adopting the newly introduced CHAID, a kind of discriminant analysis technique. The data were processed in two phases. First, the urban functions were classified by designating twenty cities, each with a population of 250,000 or more, as the predictor variable and four major urban functions (administration, marketing, finance and production) as the response variable. Then, the preeminent functions of individual regions were discriminated and classified by treating the remaining traffic, education, medicare, culture and transportation functions as the predictor variable and the following five regions as the response variable: Metropolitan Seoul area, Pusan region, Taegu region, Kwangju region and Chungcheong region. According to the results of the first analysis, marketing and administration emerged as meaningful functions in Seoul and Taegu respectively. As for the finance function, only Pusan and Pucheon can be discriminated. Seoul, Pusan and Seongnam show dominance in the production function. In the results of the second analysis, the Metropolitan Seoul area shows, among other functions, strong traffic and finance functions. In the Pusan region, the administration, education and finance functions are recorded as the leading ones, and the Taegu region is strong in the education, medicare and transportation functions. In the Kwangju region, the administration, production and education functions are discriminated from the other functions. The Chungcheong region shows a similar pattern, with only the traffic function replacing the production function of the Kwangju region.
Based on the aforementioned analysis, it can be said that the CHAID technique, which can process large amounts of categorical data and, by presenting its outcome in the form of a dendrogram, facilitates interpretation, is an effective and meaningful means of classifying and discriminating geographical regions and their characteristics.

Data Analysis of Industrial Accidents in Manufacturing Industries Using CHAID Algorithm (CHAID Algorithm을 이용한 제조업에서의 산업재해 데이터 분석)

  • Leem Young-Moon;Hwang Young-Seob
    • Proceedings of the Safety Management and Science Conference / 2006.04a / pp.45-50 / 2006
  • The main objective of this study is to provide a feature analysis of industrial accidents in manufacturing industries using the CHAID algorithm. In this study, data on 10,536 accidents were analyzed to create risk groups, including the risk of disease and accident. The sample was drawn from data on manufacturing industries in Korea over three years $(2002\sim2004)$. The resulting classification rules have been incorporated into a database tool that helps quantify the associated risks and acts as an early warning system for individual industrial accidents in manufacturing industries.

A Study on Travel Pattern Analysis and Political Application using Transportation Card Data: In Gyeonggi-Do Case (교통카드자료를 이용한 통행패턴분석과 정책활용방안 연구 -경기도를 중심으로-)

  • Bin, Miyoung;Moon, Juback;Joh, Chang-Hyeon
    • Journal of the Economic Geographical Society of Korea / v.15 no.4 / pp.615-627 / 2012
  • This study analyzed travel patterns in public transportation use with transportation card data and presented measures that can be used in traffic policy. The transportation card data covered the Gyeonggi-Do area. As a utilization plan, a scenario was set up and analyzed in which a traffic policy decision maker who wants to improve bus stop facilities selects target sites using several variables obtainable from transportation card data. The analysis used K-means cluster analysis and CHAID (Chi-squared automatic interaction detection) as decision-making methodologies, and the results showed that the data can be used effectively in policy at a significance level of p < 0.01. Based on these results, this study also presented policy implications concerning what must be improved before transportation card data can actually be used in policy.

Development of Selection Model of Subway Station Influence Area (SIA) in Seoul City using Chi-square Automatic Interaction Detection (CHAID) (CHAID분석을 이용한 서울시 지하철 역세권 지가 영향모형 개발)

  • Choi, Yu-Ran;Kim, Tae-Ho;Park, Jung-Soo
    • Journal of the Korean Society for Railway / v.11 no.5 / pp.504-512 / 2008
  • In general, under the criteria of the subway law, the area within a 500 m radius of a subway station is defined as the SIA (Subway Station Influence Area). In this paper, selection models of SIA are developed, based on CHAID analysis, to identify the appropriate SIA for specific regions of Seoul. The following results are obtained: (1) walking distance from the subway station is the most influential factor in defining the SIA; (2) SIAs vary by region (i.e., Gangnam area: 767 m, Gangbuk area: 452 m); and (3) walking distance from the subway station influences the land price within the SIA. In addition, in Gangnam, the land price structure of the section closest to the station follows a polynomial rather than a linear trend curve, in contrast with the other sections. It is therefore desirable to redefine the current SIA (500 m radius from the subway station) to reflect the land use characteristics and walking distance of each region.

A Combinatorial Optimization for Influential Factor Analysis: a Case Study of Political Preference in Korea

  • Yun, Sung Bum;Yoon, Sanghyun;Heo, Joon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.415-422 / 2017
  • Finding influential factors from a given clustering result is a typical data science problem. A genetic-algorithm-based method is proposed to derive influential factors, and its performance is compared with two conventional methods, Classification and Regression Tree (CART) and Chi-Squared Automatic Interaction Detection (CHAID), using Dunn's index. To extract the factors that influence preference for political parties in South Korea, the voting results of the $18^{th}$ presidential election and 'Demographic', 'Health and Welfare', 'Economic' and 'Business' related data were used. Based on the analysis, reverse engineering was implemented. A reverse-engineering-based approach to influential factor analysis can provide a new set of influential variables, which can offer new insight to the data mining field.
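
For readers unfamiliar with Dunn's index, the sketch below computes one common form of it: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so that higher values indicate compact, well-separated clusters. The function name `dunn_index` and the toy points are assumptions for illustration; the abstract does not state which variant of the index the authors used.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """Dunn's index for a list of clusters, each a list of coordinate tuples.

    Uses the smallest between-cluster point distance over the largest
    within-cluster point distance. Assumes at least two clusters and at
    least one cluster with two distinct points (otherwise the diameter is 0).
    """
    max_diameter = max(
        (math.dist(p, q) for cluster in clusters
         for p, q in combinations(cluster, 2)),
        default=0.0,
    )
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter


# Hypothetical toy points grouped two different ways.
good_grouping = [[(0, 0), (0, 1)], [(3, 0), (3, 1)]]
poor_grouping = [[(0, 0), (3, 0)], [(0, 1), (3, 1)]]

good_score = dunn_index(good_grouping)
poor_score = dunn_index(poor_grouping)
```

The grouping that keeps nearby points together scores higher than the one that mixes them, which is the sense in which the index can rank competing sets of influential factors by the clusters they produce.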

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

  • Tang, Tzung-I;Zheng, Gang;Huang, Yalou;Shu, Guangfu;Wang, Pengtao
    • Industrial Engineering and Management Systems / v.4 no.1 / pp.102-108 / 2005
  • This paper studies medical data classification methods, comparing decision trees and system reconstruction analysis as applied to heart disease data mining. The data were collected from patients with coronary heart disease and comprise 1,723 records of 71 attributes each. We use the system-reconstruction method to weight the data. We use decision tree algorithms such as induction of decision trees (ID3), C4.5, classification and regression tree (CART), Chi-square automatic interaction detector (CHAID), and exhaustive CHAID. We use the results to compare the correct-classification rate, number of leaves, and tree depth of the different decision-tree algorithms. The experiments show that weighting the data can improve the correct-classification rate for coronary heart disease data but has little effect on tree depth and number of leaves.

A Neural Network for Prediction and Sensitivity of Outpatients' Satisfaction (신경망모형을 이용한 외래환자 만족도예측 및 민감도분석)

  • Lee, Kyun-Jick;Chung, Young-Chul;Kim, Mi-Ra
    • Korea Journal of Hospital Management / v.8 no.1 / pp.81-94 / 2003
  • This paper aims to develop a prediction model for, and analyze the sensitivity of, outpatients' overall satisfaction with hospital services, using data mining techniques within the context of customer satisfaction. From a total of 900 outpatient cases, 80 percent were randomly selected as the training group and the other 20 percent as the validation group. Cases in the training group were used to develop the CHAID and neural network models. The validation group was used to test the performance of these models. The major findings may be summarized as follows: CHAID provided six useful predictors, namely satisfaction with treatment level, satisfaction with healthcare facilities and equipment, satisfaction with registration service, awareness of hospital reputation, satisfaction with staff's courtesy and responsiveness, and satisfaction with nurses' kindness. The prediction accuracy of the MLP (77.90%) is superior to that of the RBF (76.80%).
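
The 80/20 random split described in the abstract can be sketched as follows. This is only an illustration of the partitioning step: the helper `split_cases`, the fixed seed and the integer case IDs are assumptions, not details from the paper.

```python
import random


def split_cases(cases, train_fraction=0.8, seed=0):
    """Randomly partition cases into a training group and a validation group."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(cases)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical case IDs standing in for the 900 outpatient cases.
case_ids = list(range(900))
train, validation = split_cases(case_ids)
```

With 900 cases this gives 720 training cases and 180 validation cases, and no case appears in both groups.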
