• Title/Summary/Keyword: decision tree classifier (결정트리분류기)

Search Result 52, Processing Time 0.026 seconds

A Neural Network-Driven Decision Tree Classifier Approach to Time Series Identification (인공신경망 기초 의사결정트리 분류기에 의한 시계열모형화에 관한 연구)

  • 오상봉
    • Journal of the Korea Society for Simulation / v.5 no.1 / pp.1-12 / 1996
  • We propose a new approach to classifying time series data into one of the autoregressive moving-average (ARMA) models. It is based on two pattern recognition concepts for solving time series identification. The first is the extended sample autocorrelation function (ESACF). The second is a neural network-driven decision tree classifier (NNDTC), in which two pattern recognition techniques are tightly coupled: a neural network and a decision tree classifier. The NNDTC consists of a set of nodes, at each of which a neural network-driven decision is made on whether the connecting subtrees should be pruned. The time series identification problem can therefore be stated as solving a set of local decisions at the nodes. The decision values of the nodes are provided by neural network functions attached to the corresponding nodes. Experimental results on a set of test data and real time series data show that the proposed approach can efficiently identify time series patterns with high precision compared to previous approaches.

Bayesian Network-Based Analysis on Clinical Data of Infertility Patients (베이지안 망에 기초한 불임환자 임상데이터의 분석)

  • Jung, Yong-Gyu;Kim, In-Cheol
    • The KIPS Transactions: Part B / v.9B no.5 / pp.625-634 / 2002
  • In this paper, we conducted various experiments with Bayesian networks to analyze clinical data of infertility patients. Through these experiments, we tried to find the inter-dependencies among the important factors that play a key role in clinical pregnancy, and to compare three kinds of Bayesian network classifiers (NBN, BAN, and GBN) in terms of classification performance. The experiments showed that the most important features for clinical pregnancy (Clin) are indication (IND), stimulation, age of female partner (FA), number of ova (ICT), and use of Wallace (ETM), and revealed inter-dependencies among these features. We also confirmed that BAN and GBN, which are more general Bayesian network classifiers permitting inter-dependencies among features, show higher performance than NBN. Comparing Bayesian classifiers, which are based on probabilistic representation and reasoning, with other classifiers such as decision trees and k-nearest neighbor methods, we found that the former perform better than the latter owing to inherent characteristics of the clinical domain. Finally, we suggested a feature reduction method in which all features outside the Markov blanket of the class node are removed, and investigated experimentally whether such feature reduction can increase the performance of Bayesian classifiers.
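The Markov-blanket feature reduction mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `markov_blanket`, the dictionary representation of the network, and the toy graph with nodes `C`, `A`, `B`, `X`, `Y`, `S`, `Z` are all assumptions made here. The Markov blanket of a node in a directed acyclic graph is the set of its parents, its children, and the other parents of its children.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a directed acyclic graph.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents of `node`, children of `node`,
    and the other parents of those children.
    """
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    blanket = set(parents.get(node, set())) | children | co_parents
    blanket.discard(node)  # the node itself is not part of its blanket
    return blanket


# Hypothetical toy network (NOT the structure learned in the paper).
# "C" plays the role of the class node.
toy_parents = {
    "C": {"A", "B"},   # A and B are parents of the class node
    "X": {"C", "A"},   # X is a child of C
    "Y": {"C", "S"},   # Y is a child of C; S is a co-parent
    "Z": {"S"},        # Z is outside the blanket of C
    "A": set(),
    "B": set(),
    "S": set(),
}

kept_features = markov_blanket(toy_parents, "C")
dropped_features = set(toy_parents) - kept_features - {"C"}
```

In this toy graph only `Z` is dropped, since it is neither a parent, a child, nor a co-parent of `C`.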

Development of a Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies are being introduced for diabetes prediction, but these technologies require large amounts of data to outperform other methodologies, and their learning cost is high because of complex data models. In this study, we aim to verify the claim that a DNN using the Pima dataset and k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system gave the best results with the XGBoost classifier combined with the ADASYN method, with an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.
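The k-fold cross-validation mentioned in this abstract can be illustrated with a small index splitter. This is a sketch under stated assumptions: the function name `k_fold_indices` is hypothetical, samples are taken in their original order without shuffling or class stratification, and nothing here reproduces the study's actual pipeline.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Each sample appears in exactly one test fold. Fold sizes differ by at
    most one when n_samples is not divisible by k. No shuffling is done.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first folds take the remainder
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Usage: 10 samples split into 3 folds; a model would be trained on
# train_idx and scored on test_idx in each round, then scores averaged.
folds = list(k_fold_indices(10, 3))
```

With an imbalanced dataset, a stratified split that preserves class proportions in each fold is usually preferable; that refinement is left out here to keep the sketch short.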

Phonetic Question Set Generation Algorithm (음소 질의어 집합 생성 알고리즘)

  • 김성아;육동석;권오일
    • The Journal of the Acoustical Society of Korea / v.23 no.2 / pp.173-179 / 2004
  • Because training data are insufficient in large vocabulary continuous speech recognition, similar context-dependent phones can be clustered by decision trees to share the data. When the decision trees are built and used to predict unseen triphones, a phonetic question set is required. The phonetic question set, which contains categories of phones with similar co-articulation effects, is usually generated by phonetic or linguistic experts. This knowledge-based approach to generating a phonetic question set, however, may reduce the homogeneity of the clusters. Moreover, the experts must adjust the question sets whenever the language or the PLU (phone-like unit) of a recognition system is changed. Therefore, we propose a data-driven method to automatically generate a phonetic question set. Since the proposed method generates the phone categories from the speech data distribution, it does not depend on the language or the PLU, and may enhance the homogeneity of the clusters. In large vocabulary speech recognition experiments, the proposed algorithm was found to reduce the error rate by 14.3%.

Performance comparison of machine learning classification methods for decision of disc cutter replacement of shield TBM (쉴드 TBM 디스크 커터 교체 유무 판단을 위한 머신러닝 분류기법 성능 비교)

  • Kim, Yunhee;Hong, Jiyeon;Kim, Bumjoo
    • Journal of Korean Tunnelling and Underground Space Association / v.22 no.5 / pp.575-589 / 2020
  • In recent years, shield TBM construction has been continuously increasing in domestic tunnels. The main excavation tool in shield TBM construction is the disc cutter, which naturally wears during excavation and significantly degrades excavation efficiency. Therefore, it is important to know the appropriate time for disc cutter replacement. This study proposes a predictive model that uses machine learning algorithms to decide whether a disc cutter should be replaced. To do this, shield TBM machine data that are highly correlated with disc cutter wear, together with disc cutter replacement records from a completed shield TBM site, are used as the input data for the model. The algorithms used in the study are the support vector machine, the k-nearest neighbor algorithm, and the decision tree algorithm, all of which are classification methods used in machine learning. To construct an optimal predictive model and evaluate its performance, the classification performance evaluation indices were compared and analyzed.

A Design Solution for a Railway Switch Monitoring System (분기기 진단 시스템 설계에 관한 연구)

  • Choo, Eun-Sang;Kim, Min-Seong;Yoo, Heung-Yeol;Mo, Choong-Seon;Son, Eui-Sik;Park, Seongguen;Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.18 no.5 / pp.439-446 / 2015
  • The turnout system, which determines the direction of the train, is not only a key system but also a vulnerable one. Failure of this system may lead to train delays or even casualties. It is therefore necessary to diagnose the condition of the turnout system precisely. Currently, ROADMASTER of Germany is used as a diagnostic system in Korea. However, a new diagnostic system should be developed for optimized operation of the turnout system, with maintenance suited to the Korean railway environment. In this paper, a Fault Tree Analysis of the representative faults of the turnout system is conducted, and the physical quantities that can cause a fault are classified by component and function. The measuring factors for monitoring are also derived, and a decision-making theory is suggested. On the basis of the results, we propose a new turnout diagnostic system that can provide more diverse and precise information than the conventional system.

Automatic ADL Classification Using 3 Axial Accelerometers and RFID Sensor (3차원 가속 센서 및 RFID 센서를 이용한 ADL 자동 분류)

  • Im, Sae-Mi;Kim, Ig-Jae;Ahn, Sang-Chul;Kim, Hyoung-Gon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.3 / pp.135-141 / 2008
  • We propose a new method for recognizing the activities of daily living (ADL) based on state-dependent motion analysis using 3-axial accelerometers and a glove-type RFID reader. Two accelerometers are used to classify 5 body states with a decision tree. Instrumental activities are classified from the hand's interaction with an object ID, using an accelerometer and an RFID reader. Object-dependent hand movements are classified into 5 categories in advance, and the final decision combines the body state and the instrumental activity. Experiments show that the suggested hierarchical motion analysis achieves an accuracy of over 90% for all 20 ADLs.

Application of Hyperspectral Imagery to Decision Tree Classifier for Assessment of Spring Potato (Solanum tuberosum) Damage by Salinity and Drought (초분광 영상을 이용한 의사결정 트리 기반 봄감자(Solanum tuberosum)의 염해 판별)

  • Kang, Kyeong-Suk;Ryu, Chan-Seok;Jang, Si-Hyeong;Kang, Ye-Seong;Jun, Sae-Rom;Park, Jun-Woo;Song, Hye-Young;Lee, Su Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.4 / pp.317-326 / 2019
  • Salinity, which is often detected on reclaimed land, is a major detrimental factor for crop growth. It would be advantageous to develop a non-destructive approach for assessing salinity and drought damage over a large landfill area. The objective of this study was to examine the applicability of a decision tree classifier using imagery to classify spring potatoes (Solanum tuberosum) damaged by salinity or drought at vegetation growth stages. We focused on comparing the accuracies, in terms of OA (overall accuracy) and KC (kappa coefficient), between the simple reflectance and band ratios that minimize the effect of uneven light. Spectral merging based on commercial band widths with a full width at half maximum (FWHM) of 10 nm, 25 nm, and 50 nm was also considered with a view to developing a multispectral image sensor. For classification based on the original simple reflectance with 5 nm of FWHM, between 3 and 13 bands were selected, with an accuracy of less than 66.7% for OA and 40.8% for KC in all FWHMs. The maximum values of OA and KC were 78.7% and 57.7%, respectively, with 10 nm of FWHM for classifying salinity and drought damage in spring potato. When the classifier was built on the band ratios, the accuracy was more than 95% for both OA and KC regardless of growth stage and FWHM. If a multispectral image sensor is made with six bands (the ratios of three bands) with 10 nm of FWHM, spring potatoes damaged by salinity or drought can be classified from image reflectance with 91.3% OA and 85.0% KC.
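The two accuracy measures reported in this abstract, OA and KC, follow standard definitions that can be computed from a confusion matrix. The sketch below assumes the usual Cohen's kappa formula, (observed agreement - chance agreement) / (1 - chance agreement); the function name `overall_accuracy_and_kappa` and the example matrix are made up for illustration and are not data from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy (OA) and Cohen's kappa coefficient (KC).

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (square matrix).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up two-class example (e.g. damaged vs. undamaged), not paper data.
example = [[40, 10],
           [5, 45]]
oa, kc = overall_accuracy_and_kappa(example)  # OA = 0.85, KC is about 0.70
```

Because kappa subtracts the agreement expected by chance, it is lower than OA whenever chance agreement is above zero, which is consistent with the KC values in the abstract being lower than the corresponding OA values.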

Spectral Band Selection for Detecting Fire Blight Disease in Pear Trees by Narrowband Hyperspectral Imagery (초분광 이미지를 이용한 배나무 화상병에 대한 최적 분광 밴드 선정)

  • Kang, Ye-Seong;Park, Jun-Woo;Jang, Si-Hyeong;Song, Hye-Young;Kang, Kyung-Suk;Ryu, Chan-Seok;Kim, Seong-Heon;Jun, Sae-Rom;Kang, Tae-Hwan;Kim, Gul-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.15-33 / 2021
  • In this study, the possibility of discriminating fire blight (FB) infection was tested using hyperspectral imagery. The reflectance of healthy and infected leaves and branches was acquired with 5 nm of full width at half maximum (FWHM) and then standardized to 10 nm, 25 nm, 50 nm, and 80 nm of FWHM. The standardized samples were divided into training and test sets at ratios of 7:3, 5:5, and 3:7 to find the optimal bands of FWHM by decision tree analysis. Classification accuracy was evaluated using overall accuracy (OA) and kappa coefficient (KC). The hyperspectral reflectance of infected leaves and branches was significantly lower than that of healthy ones in the green, red-edge (RE), and near-infrared (NIR) regions. The bands selected for the first node were generally 750 and 800 nm; these were used to identify infection of leaves and branches, respectively. The accuracy of the classifier was higher with the 7:3 ratio. Four bands with 50 nm of FWHM (450, 650, 750, and 950 nm) might be reasonable, because the difference in recalculated accuracy between 8 bands with 10 nm of FWHM (440, 580, 640, 660, 680, 710, 730, and 740 nm) and the 4 bands was only 1.8% for OA and 4.1% for KC. Finally, adding two bands (550 nm and 800 nm with 25 nm of FWHM) to the four bands with 50 nm of FWHM has been proposed to improve the usability of multispectral image sensors, so that they can serve various roles in agriculture as well as detect FB with other combinations of spectral bands.
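How a decision tree picks the band for its first node can be sketched with a single-split search. The abstract does not state which splitting criterion was used, so the Gini impurity (as in CART) is an assumption here, as are the function names `gini` and `best_first_split`. The reflectance values are invented toy numbers, not measurements from the study.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_first_split(samples, labels):
    """Find the (band, threshold) pair minimizing weighted Gini impurity.

    samples: list of dicts mapping band wavelength (nm) -> reflectance.
    labels:  list of class labels, one per sample.
    Returns (band, threshold, weighted_impurity).
    """
    n = len(samples)
    best = None
    for band in samples[0]:
        values = sorted({s[band] for s in samples})
        # Candidate thresholds are midpoints between consecutive values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [y for s, y in zip(samples, labels) if s[band] <= threshold]
            right = [y for s, y in zip(samples, labels) if s[band] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (band, threshold, score)
    return best


# Invented toy reflectances for two bands (650 nm and 750 nm).
toy_samples = [
    {650: 0.10, 750: 0.50},
    {650: 0.12, 750: 0.55},
    {650: 0.11, 750: 0.30},
    {650: 0.13, 750: 0.28},
]
toy_labels = ["healthy", "healthy", "infected", "infected"]
first_band, first_threshold, impurity = best_first_split(toy_samples, toy_labels)
```

In this toy data the 750 nm band separates the two classes perfectly, so it is chosen for the first node; a full tree would repeat the same search recursively on each side of the split.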

Biological Early Warning Systems using UChoo Algorithm (UChoo 알고리즘을 이용한 생물 조기 경보 시스템)

  • Lee, Jong-Chan;Lee, Won-Don
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.1 / pp.33-40 / 2012
  • This paper proposes a method for implementing biological early warning systems (BEWS). The system periodically generates data events using a monitoring daemon and extracts feature parameters from these data sets. The feature parameters are derived from 6 variables: x/y coordinates, distance, absolute distance, angle, and fractal dimension. In particular, using fractal dimension theory, the proposed algorithm defines input features that represent the characteristics of the organism in non-toxic and toxic environments. To find a suitable algorithm for learning the extracted feature data, the system uses an extended learning algorithm (UChoo) popularly used in machine learning. This algorithm includes a learning method with an extended data expression to cope with the BEWS environment, in which feature sets are added periodically by the monitoring daemon. In this algorithm, the decision tree classifier defines class distribution information using the weight parameter in the extended data expression. Experimental results show that the proposed BEWS is applicable to environmental toxicity detection.