• Title/Summary/Keyword: tree classification method


CHAID Algorithm by Cube-based Proportional Sampling

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2004.04a
    • /
    • pp.39-50
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in domains such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. CHAID (Chi-square Automatic Interaction Detector) uses the chi-squared statistic to determine splits and is an exploratory method for studying the relationship between a dependent variable and a series of predictor variables. In this paper we propose a CHAID algorithm based on cube-based proportional sampling and examine its accuracy and speed as a function of the number of variables.

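The abstract above says that CHAID uses the chi-squared statistic to determine splits. A minimal Python sketch of that idea follows; it is not the authors' algorithm. It only ranks predictors by the raw Pearson statistic, whereas full CHAID also merges categories and compares adjusted p-values, and it does not attempt the cube-based proportional sampling step. The names `chi_squared` and `best_split` and the toy data are hypothetical.

```python
from collections import Counter

def chi_squared(xs, ys):
    """Pearson chi-squared statistic for the contingency table of xs vs ys."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    stat = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

def best_split(predictors, target):
    """Name of the predictor with the largest statistic (simplified rule)."""
    return max(predictors, key=lambda name: chi_squared(predictors[name], target))

# Hypothetical toy data: "member" separates the target perfectly, "region" not at all.
predictors = {
    "region": ["n", "n", "s", "s", "n", "s", "n", "s"],
    "member": ["y", "y", "y", "y", "n", "n", "n", "n"],
}
target = ["buy", "buy", "buy", "buy", "no", "no", "no", "no"]

print(best_split(predictors, target))
```

Comparing raw statistics is only fair when the predictors have the same number of categories; with differing table sizes, p-values would be the appropriate basis for comparison.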

CHAID Algorithm by Cube-based Proportional Sampling

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.4
    • /
    • pp.803-816
    • /
    • 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in domains such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. CHAID uses the chi-squared statistic to determine splits and is an exploratory method for studying the relationship between a dependent variable and a series of predictor variables. In this paper we propose a CHAID algorithm based on cube-based proportional sampling and examine its accuracy and speed as a function of the number of variables.


On the Performance Analysis of an Automatic Neural Network Signal Classifier (신경회로망을 이용한 신호 자동식별기 구현 및 성능분석)

  • Yoon, Byung-Soo;Yang, Seong-Chul;Nam, Sang-Won;Oh, Won-Tcheon
    • Proceedings of the KIEE Conference
    • /
    • 1994.11a
    • /
    • pp.397-399
    • /
    • 1994
  • In this paper, a feature-based automatic neural network signal classifier is presented, in which five neural network algorithms (MLP, RBF, LVQ2, MLP-Tree and LVQ-Tree) are combined in parallel to classify various signals from their features by majority vote. To demonstrate the performance and applicability of the proposed signal classifier, test results for the classification of synthetic waveforms and power disturbances are provided.

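The majority vote that combines the five parallel classifiers in the abstract above can be sketched in Python as follows. This is an illustration only: the function name `majority_vote`, the tie-breaking rule and the example labels are assumptions, not details taken from the paper.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label; on a tie, the label seen first wins."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical outputs of the five classifiers for one input signal.
votes = {
    "MLP": "sag",
    "RBF": "sag",
    "LVQ2": "swell",
    "MLP-Tree": "sag",
    "LVQ-Tree": "swell",
}

print(majority_vote(list(votes.values())))
```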

Environmental Gradient Analysis of Forest Vegetation of Mt. Naejang, Southwestern Korea (내장산 삼림식생의 환경경도분석)

  • 김정언
    • Journal of Plant Biology
    • /
    • v.31 no.1
    • /
    • pp.33-39
    • /
    • 1988
  • Environmental gradient analyses were applied to the ordination of forest vegetation in the Mt. Naejang national park area in Korea. The species population sequence along the soil moisture gradient, from mesic to xeric, was as follows: Zelkova serrata, Celtis sinensis, Lindera erythrocarpa, Cornus controversa, Acer mono, Carpinus tschonoskii, Quercus aliena, Daphniphyllum macropodum, Torreya nucifera, Carpinus laxiflora, Quercus serrata, Quercus variabilis, Quercus mongolica and Pinus densiflora among tree species, and Acer pseudo-sieboldianum var. koreanum, Lindera obtusiloba, Styrax obassia, Styrax japonica, Acer pseudo-sieboldianum and Rhododendron schlippenbachii among shrub species. Ten ecological groups of tree species were identified, and they coincided with the vegetation units (associations) of the phytosociological classification by the Z-M method. Four vegetation types were separated in a mosaic chart by two-dimensional analysis of the elevation and soil moisture gradients: cove forest with Zelkova serrata and Lindera erythrocarpa, hornbeam forest with Carpinus laxiflora and Carpinus tschonoskii, oak forest with Quercus variabilis and Quercus mongolica, and pine forest with Pinus densiflora as the dominant species.


An early fouling alarm method for a ceramic microfiltration pilot plant using machine learning (머신러닝을 활용한 세라믹 정밀여과 파일럿 플랜트의 파울링 조기 경보 방법)

  • Dohyun Tak;Dongkeon Kim;Jongmin Jeon;Suhan Kim
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.37 no.5
    • /
    • pp.271-279
    • /
    • 2023
  • Fouling is an inevitable problem in membrane water treatment plants. It can be measured by trans-membrane pressure (TMP) in constant flux operation, and chemical cleaning is carried out when TMP reaches a critical value. An early fouling alarm is defined as a warning, issued in advance, that the critical TMP value will be reached. The alarm method was developed using a machine learning algorithm, the decision tree, and applied to a ceramic microfiltration (MF) pilot plant. First, a decision tree model that classifies the normal/abnormal state of each filtration cycle of the ceramic MF pilot plant was developed, and it was then used to build the early fouling alarm method. The accuracy of the classification model was up to 96.2%, and the early warning was issued when abnormal cycles occurred three times in a row. The early fouling alarm can anticipate the limit TMP in advance (e.g., by 15-174 hours). Adopting the TMP increase rate and the backwash efficiency as machine learning variables increased the model accuracy and the reliability of the early fouling alarm method, respectively.
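The alarm rule in the abstract above (a warning once abnormal cycles occur three times in a row) can be sketched in Python as below. The per-cycle normal/abnormal flags are assumed to come from some classifier, which is not reproduced here; the function name `first_alarm_cycle` and the example sequence are hypothetical.

```python
def first_alarm_cycle(states, run_length=3):
    """Return the index of the cycle at which the alarm fires, or None.

    states: iterable of booleans, True meaning the cycle was classified abnormal.
    """
    streak = 0
    for i, abnormal in enumerate(states):
        # Extend the streak on an abnormal cycle, reset it on a normal one.
        streak = streak + 1 if abnormal else 0
        if streak >= run_length:
            return i
    return None

# Hypothetical classification results for eight filtration cycles.
states = [False, True, True, False, True, True, True, False]

print(first_alarm_cycle(states))
```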

Automatic ADL Classification Using 3 Axial Accelerometers and RFID Sensor (3차원 가속 센서 및 RFID 센서를 이용한 ADL 자동 분류)

  • Im, Sae-Mi;Kim, Ig-Jae;Ahn, Sang-Chul;Kim, Hyoung-Gon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.3
    • /
    • pp.135-141
    • /
    • 2008
  • We propose a new method for recognizing the activities of daily living (ADL) based on state-dependent motion analysis using 3-axial accelerometers and a glove-type RFID reader. Two accelerometers are used to classify 5 body states with a decision tree. The instrumental activities are classified from the hand's interaction with an object ID, using an accelerometer and an RFID reader. Object-dependent hand movements are classified into 5 categories in advance, and the final decision combines the body state and the instrumental activity. Experiments show that the suggested hierarchical motion analysis provides an accuracy rate of over 90% for all 20 ADLs.

Dynamic recomposition of document category using user intention tree (사용자 의도 트리를 사용한 동적 카테고리 재구성)

  • Kim, Hyo-Lae;Jang, Young-Cheol;Lee, Chang-Hoon
    • The KIPS Transactions:PartB
    • /
    • v.8B no.6
    • /
    • pp.657-668
    • /
    • 2001
  • It is difficult to classify web documents according to the exact user intention because existing document classification systems rely on word frequency counts for a single keyword. To remedy this defect, we first use a keyword, a query, and domain knowledge. As in explanation-based learning, the query is first analyzed with knowledge-based information, and structured user intention information is then extracted. We use this intention tree as user information and constraints within existing word-frequency-based document classification. Thus, we can classify web documents more closely in line with the user's intention. In classifying documents, structured user intention information helps retain documents and information that can be lost in a system using only single-keyword information. Our hybrid approach, which integrates user intention information with existing statistical and probabilistic methods, is more efficient at deciding the direction and range of a document category than the existing word frequency approach.


Case Study of CRM Application Using Improvement Method of Fuzzy Decision Tree Analysis (퍼지의사결정나무 개선방법을 이용한 CRM 적용 사례)

  • Yang, Seung-Jeong;Rhee, Jong-Tae
    • The Journal of the Korea Contents Association
    • /
    • v.7 no.8
    • /
    • pp.13-20
    • /
    • 2007
  • The decision tree is one of the most useful analysis methods for various data mining functions on massive data, including prediction and classification. A decision tree grows by splitting nodes, during which the purity increases. Splitting should stop when the purity no longer increases effectively or when new leaves do not contain a meaningful number of records. A branch is pruned if it does not show a certain level of performance. Pruning changes the structure of the decision tree and implies that the previous splitting of the parent node was not effective. It also implies that the splitting of the ancestor nodes was not effective and that the choices of attributes and criteria in splitting them were not successful. New attributes or criteria might therefore be selected to split such nodes in a further attempt. In this paper, we suggest a procedure that modifies the decision tree by integrating fuzzy theory with splitting.
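The stopping rule described in the abstract above (stop when purity does not increase effectively or when new leaves hold too few records) can be sketched in Python as follows. This is only an illustration of a crisp, non-fuzzy rule: the abstract does not name a purity measure, so Gini impurity is an assumption here, as are the function names and the thresholds `min_gain` and `min_leaf`.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def should_split(parent, children, min_gain=0.05, min_leaf=2):
    """Accept a split only if every leaf is large enough and purity improves."""
    if any(len(child) < min_leaf for child in children):
        return False  # a new leaf would hold too few records
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted >= min_gain

# Hypothetical node with four records and three candidate splits.
parent = ["a", "a", "b", "b"]
print(should_split(parent, [["a", "a"], ["b", "b"]]))   # purity improves
print(should_split(parent, [["a", "b"], ["a", "b"]]))   # no improvement
print(should_split(parent, [["a"], ["a", "b", "b"]]))   # leaf too small
```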

Movie Popularity Classification Based on Support Vector Machine Combined with Social Network Analysis

  • Dorjmaa, Tserendulam;Shin, Taeksoo
    • Journal of Information Technology Services
    • /
    • v.16 no.3
    • /
    • pp.167-183
    • /
    • 2017
  • The rapid growth of information technology and mobile service platforms (e.g., the Internet, Google, and Facebook) has led to an abundance of data. As a result, the world is now facing a revolution in the way data is searched, collected, stored, and shared. The abundance of data gives knowledge discovery and data mining techniques several opportunities. In recent years, data mining methods for discovering and extracting available knowledge from databases have become more popular in e-commerce service fields, in particular movie recommendation. However, most classification approaches for predicting movie popularity have used only a few types of movie information, such as actor, director, rating score, language and country. In this study, we propose a classification model based on the support vector machine (SVM) for predicting movie popularity from the movies' genre data and social network data. Social network analysis (SNA) is used to improve the classification accuracy. This study builds the movies' network (a one-mode network) from initial data that form a two-mode, user-to-movie network. For the proposed method we computed degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality as centrality measures in the movies' network. These four centrality values and the movies' genre data were used to classify movie popularity. Logistic regression, a neural network, a naïve Bayes classifier, and a decision tree were also used as benchmark models for comparison with the performance of our proposed model. To assess the classifiers' accuracy, this study used MovieLens data as an open database. Our empirical results indicate that our proposed model with the movies' genre and centrality data has approximately 0% higher accuracy than other classification models with only the movies' genre data. Our results imply that the proposed model can be used to improve movie popularity classification accuracy.
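The step from a two-mode user-to-movie network to a one-mode movie network, followed by a centrality computation, can be sketched in Python as below. Only degree centrality is shown. The linking rule (two movies are connected when at least one user is linked to both) is an assumption, since the abstract does not state how the projection was built; the function names and the toy data are hypothetical.

```python
from itertools import combinations

def project_movies(user_movies):
    """One-mode projection: link two movies when some user is linked to both."""
    neighbors = {}
    for movies in user_movies.values():
        for movie in movies:
            neighbors.setdefault(movie, set())
        for a, b in combinations(sorted(set(movies)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def degree_centrality(neighbors):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical two-mode data: each user mapped to the movies they rated.
user_movies = {"u1": ["A", "B"], "u2": ["B", "C"], "u3": ["B", "D"]}

centrality = degree_centrality(project_movies(user_movies))
print(centrality)
```

In this toy network movie B is linked to every other movie, so it receives the maximum degree centrality of 1.0.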

Multiple SVM Classifier for Pattern Classification in Data Mining (데이터 마이닝에서 패턴 분류를 위한 다중 SVM 분류기)

  • Kim Man-Sun;Lee Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.3
    • /
    • pp.289-293
    • /
    • 2005
  • Pattern classification extracts various types of pattern information describing objects in the real world and decides their class. The top priority of pattern classification technologies is to improve classification performance, and many researchers have tried various approaches to this end over the last 40 years. Classification methods used in pattern classification include the base classifier based on probabilistic inference of patterns, the decision tree, methods based on distance functions, neural networks and clustering, but they are not efficient at analyzing large amounts of multi-dimensional data. Thus, there is active research on multiple classifier systems, which improve classification performance by combining a number of mutually compensatory classifiers. The present study identifies problems in previous research on multiple SVM classifiers and proposes BORSE, a model that, based on a 1:M policy for extending SVM to a multi-class classifier, regards each SVM output as a signal with a non-linear pattern, trains a neural network on that pattern, and combines the final classification results.