• Title/Summary/Keyword: tree classification

Predicting Discharge Rate of After-care patient using Hierarchy Analysis

  • Jung, Yong Gyu;Kim, Hee-Wan;Kang, Min Soo
    • International Journal of Advanced Culture Technology / v.4 no.2 / pp.38-42 / 2016
  • In a growing, data-saturated world, the question has shifted from "can data be used?" to "can it be used effectively?" More data is being generated and used than ever before. As data collection grows, data mining techniques must become increasingly accurate, and the analysis must be efficient if the data is to be used effectively. The desired information is obtained by interpreting the results of analyzing the given data set. Data mining offers methods such as decision trees and clustering, but algorithms that account for the impact of the various factors involved have not yet been fully developed. This experiment uses classification, a data mining technique, with simple decision trees, specifically the OneR and J48 classifiers. OneR ("one rule") is a simple and accurate classification algorithm: it selects the rule with the smallest error and uses that single rule to predict the data. To create a rule for prediction, a frequency table is built for each predictor against the target. OneR then generates and displays the rules, giving a simple rule that the researcher can interpret. J48, by contrast, builds a decision tree using concepts from information theory and can correctly classify the specified patterns. The two algorithms can be compared by analyzing error rate and accuracy. OneR and J48 are two frequently used classifiers...
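The frequency-table step of OneR described in this abstract can be illustrated with a minimal sketch. It assumes categorical features stored as dictionaries; the function name `one_r` and the toy rows are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (feature, rule, errors) for the single feature with the fewest errors."""
    best = None
    features = [name for name in rows[0] if name != target]
    for feature in features:
        # Frequency table: feature value -> counts of each target class.
        table = defaultdict(Counter)
        for row in rows:
            table[row[feature]][row[target]] += 1
        # Rule: predict the most frequent class for each feature value.
        rule = {value: counts.most_common(1)[0][0] for value, counts in table.items()}
        # Errors: rows that fall outside the majority class of their feature value.
        errors = sum(sum(c.values()) - max(c.values()) for c in table.values())
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


# Hypothetical toy data, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
feature, rule, errors = one_r(rows, "play")
print(feature, errors)
```

On this toy data the `outlook` rule misclassifies one row while the `windy` rule misclassifies two, so `outlook` is selected.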

A Comparison of Pixel- and Segment-based Classification for Tree Species Classification using QuickBird Imagery (QuickBird 위성영상을 이용한 수종분류에서 픽셀과 분할기반 분류방법의 정확도 비교)

  • Chung, Sang Young;Yim, Jong Su;Shin, Man Yong
    • Journal of Korean Society of Forest Science / v.100 no.4 / pp.540-547 / 2011
  • This study compared the accuracy of tree species classification from QuickBird imagery using pixel- and segment-based classification, methods that have mostly been applied to land cover classification. A total of 398 points were used as training and reference data. These points were classified into fourteen land cover classes: four coniferous and seven deciduous tree species in the forest classes, and three non-forest classes. For pixel-based classification, three images were produced from raw spectral values, three tasseled cap indices, and three components from principal component analysis. The maximum likelihood method was applied in both classification processes. In pixel-based classification, raw spectral values gave better classification accuracy than the other band combinations. Segment-based classification with a scale factor of 50% provided the most accurate classification (overall accuracy: 76%; $\hat{k}$ value: 0.74) compared with the other scale factors and with pixel-based classification.
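Overall accuracy and the $\hat{k}$ (kappa) statistic reported in this abstract are both derived from a confusion matrix. The sketch below shows the standard computation; the 2x2 matrix is a made-up illustration and does not reproduce the paper's fourteen-class results.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix (rows: reference classes, columns: predicted classes).
accuracy, kappa = overall_accuracy_and_kappa([[20, 5], [10, 15]])
print(accuracy, kappa)
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4.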

A Comparative Study of Phishing Websites Classification Based on Classifier Ensemble

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / v.21 no.5 / pp.617-625 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining sensitive information such as bank account details, credit card numbers, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e. random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e. decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator of the significance of differences among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
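As a reminder of what the AUC metric used in this abstract measures, the following minimal sketch computes it as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. The scores and labels are hypothetical and are not data from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairs where the positive outranks the negative; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; label 1 marks a phishing site.
value = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(value)
```

In this example three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75.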

The Generation of Test Case Flow Using Classification Tree Method and Functional Analysis for River Crossing of Wheeled-Vehicle (분류트리기법(CTM)과 기능분석을 활용한 차륜형 전투차량 수상운행 테스트 케이스 플로우 생성에 관한 연구)

  • Lee, In Ho;Lee, Cheol Woo;Park, Tae Woo;Nam, Hae Sung;Kang, Ho Sin;Kim, Eui Whan
    • Journal of the Korean Society of Systems Engineering / v.10 no.1 / pp.73-80 / 2014
  • Designing test case flows for the water crossing operation of a wheeled vehicle is a new attempt for which very limited experience exists. In this paper, a Function Flow Block Diagram (FFBD) and the Classification Tree Method (CTM) were combined to see whether this approach is viable for generating test case flows at the functional analysis stage. It was found that the method can be used in practice even for very complicated test case generation.

The Prediction Performance of the CART Using Bank and Insurance Company Data (CART의 예측 성능:은행 및 보험 회사 데이터 사용)

  • Park, Jeong-Seon
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1468-1472 / 1996
  • In this study, the performance of CART (Classification and Regression Tree) is compared with that of the discriminant analysis method. In most experiments using bank data, discriminant analysis shows better performance in terms of total cost. In contrast, most experiments using insurance data show that CART is better than discriminant analysis in terms of total cost. These contradictory results are analysed using the characteristics of the data sets. The performance of both CART and discriminant analysis depends on the following parameters: failure prior probability, data used, type I error, type II error cost, and validation method.

Integrity Assessment for Reinforced Concrete Structures Using Fuzzy Decision Making (퍼지의사결정을 이용한 RC구조물의 건전성평가)

  • 박철수;손용우;이증빈
    • Proceedings of the Computational Structural Engineering Institute Conference / 2002.04a / pp.274-283 / 2002
  • This paper presents an efficient model for reinforced concrete structures using CART-ANFIS (classification and regression tree-adaptive neuro fuzzy inference system). A fuzzy decision tree partitions the input space of a data set into mutually exclusive regions, each of which is assigned a label, a value, or an action to characterize its data points. Fuzzy decision trees used for classification problems are often called fuzzy classification trees, and each terminal node contains a label that indicates the predicted class of a given feature vector. In the same vein, decision trees used for regression problems are often called fuzzy regression trees, and the terminal node labels may be constants or equations that specify the predicted output value of a given input vector. Note that CART can select relevant inputs and perform tree partitioning of the input space, while ANFIS refines the regression and makes it continuous and smooth everywhere. CART and ANFIS are thus complementary, and their combination constitutes a solid approach to fuzzy modeling.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensembles

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System / v.5 no.2 / pp.99-104 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining sensitive information such as bank account details, credit card numbers, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e. random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e. decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator of the significance of differences among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.

Classification Tree-Based Feature-Selective Clustering Analysis: Case of Credit Card Customer Segmentation (분류나무를 활용한 군집분석의 입력특성 선택: 신용카드 고객세분화 사례)

  • Yoon Hanseong
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.1-11 / 2023
  • Clustering analysis is used in various fields, including customer segmentation, and clustering methods such as k-means are actively applied to credit card customer segmentation. In this paper, we summarize an input feature selection method for k-means clustering in the credit card customer segmentation problem and evaluate its feasibility through the analysis results. Using the label values of the k-means clustering results as the target feature of a decision tree classification, we compose a method for prioritizing input features by the information gain of each branch. Effectiveness is not easy to determine with clustering effectiveness indices, but in the case of the CH index, cluster effectiveness improves evidently with the method presented in this paper compared with randomly determined priorities. The suggested method can be used to improve the effectiveness of widely used clustering analyses, including the k-means method.
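The idea of ranking input features by information gain against cluster labels can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: features are assumed to be categorical, the gain is computed for a single split per feature rather than read from a fitted tree, and the feature names and cluster labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical cluster labels (e.g. from k-means) and two candidate input features.
cluster = [0, 0, 1, 1]
features = {
    "spend_level": ["lo", "lo", "hi", "hi"],  # separates the clusters perfectly
    "card_color": ["x", "y", "x", "y"],       # carries no information about them
}
gains = {name: information_gain(vals, cluster) for name, vals in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking, gains)
```

Features with higher gain explain more of the cluster assignment and would be given higher priority as clustering inputs.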

An Efficient One Class Classifier Using Gaussian-based Hyper-Rectangle Generation (가우시안 기반 Hyper-Rectangle 생성을 이용한 효율적 단일 분류기)

  • Kim, Do Gyun;Choi, Jin Young;Ko, Jeonghan
    • Journal of Korean Society of Industrial and Systems Engineering / v.41 no.2 / pp.56-64 / 2018
  • In recent years, imbalanced data has become one of the most important and frequent issues for quality control in industry. For example, defect rates have been drastically reduced thanks to highly developed technology and quality management, so only a few defective samples can be obtained from a production process. Quality classification must therefore be performed under the condition that one class (the defective dataset) is far smaller than the other (the good dataset). However, traditional multi-class classification methods are not appropriate for such imbalanced datasets, since they classify data based on differences between one class and the others, which can hardly be found in imbalanced datasets. One-class classification, which thoroughly learns the patterns of a target class, is more suitable for imbalanced datasets because it focuses only on data in the target class. So far, several one-class classification methods have been suggested, such as one-class support vector machines, neural networks, and decision trees. One-class support vector machines and neural networks can guarantee a good classification rate, and decision trees can provide a set of rules that can be clearly interpreted. However, the classifiers obtained from the former two methods consist of complex mathematical functions and cannot be easily understood by users, while in the case of decision trees the criterion for rule generation is ambiguous. As an alternative, a one-class classifier using hyper-rectangles was previously proposed, which classifies precisely compared with other methods and also generates rules that users can clearly understand. In this paper, we suggest an approach for overcoming the limitations of those previous one-class classification algorithms. Specifically, the suggested approach produces an improved one-class classifier using hyper-rectangles generated with a Gaussian function. The performance of the suggested algorithm is verified by a numerical experiment using several datasets from the UCI machine learning repository.

Classification of COVID-19 Disease: A Machine Learning Perspective

  • Kinza Sardar
    • International Journal of Computer Science & Network Security / v.24 no.3 / pp.107-112 / 2024
  • The deadly virus known as COVID-19 emerged in Wuhan, China, in 2019 and spread all over the world, affecting millions of people in a very short time. Because COVID-19 has so many symptoms, identifying an infected person is a difficult task, and it is likewise challenging to determine whether an individual will test positive or negative. We are developing a framework based on machine learning techniques. The proposed method uses the DecisionTree, KNearestNeighbors, GaussianNB, LogisticRegression, BernoulliNB, and RandomForest machine learning methods as classifiers for COVID-19 diagnosis, with 5-fold and 10-fold cross-validation applied during classification. Data preprocessing techniques were applied to improve classification performance. Recall, accuracy, precision, and F-score were used to evaluate classification performance. The experimental results showed that the best accuracy was obtained with the Decision Tree classifier. In future work, we will apply different techniques to improve on the 93 percent accuracy achieved so far.
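The 5-fold and 10-fold cross-validation mentioned in this abstract splits the data into k parts and uses each part once as the test set. The sketch below generates such splits with contiguous, unshuffled folds; the function name `k_fold_indices` is hypothetical, and the paper does not state how its folds were formed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds of near-equal size."""
    # The first n_samples % k folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 10 samples split into 5 folds of 2 test samples each.
folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print(test, train)
```

A model would be trained on each `train` set and scored on the matching `test` set, and the k scores averaged to give the reported accuracy.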