• Title/Summary/Keyword: tree classification method

Classification and Characteristics of Forest Community in Seodaesan, Geumsan (금산 서대산의 임분 특성 및 군락 분류)

  • Ji, Yun-Ui;Song, Ho-Kyung
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.7 no.5
    • /
    • pp.38-46
    • /
    • 2004
  • This study analyzed the forest vegetation of Seodaesan in Geumsan, Chungnam Province. Using the relevé method of Braun-Blanquet and the quadrat method, 36 plots were sampled in the Seodaesan forest. The Quercus mongolica community was classified into Pinus densiflora, Acer pseudosieboldianum, and Carpinus laxiflora sub-communities. The importance values were 77.07 for Quercus mongolica, 40.79 for Pinus densiflora, 17.03 for Fraxinus rhynchophylla, 14.06 for Fraxinus sieboldiana, 13.99 for Quercus serrata, and 12.93 for Acer pseudosieboldianum. Coverage was 84.6% in the tree layer, 52.8% in the subtree layer, 29.1% in the shrub layer, and 27.9% in the herb layer. Most Quercus mongolica and Pinus densiflora trees had a DBH between 5 cm and 20 cm. Therefore, Quercus mongolica and Pinus densiflora may remain the dominant species in the study area for several decades. The Acer pseudosieboldianum and Carpinus laxiflora sub-communities were distributed mainly in high-altitude, northern, and north-western areas. The Pinus densiflora sub-community was distributed mainly in low-altitude and western areas.

Combined Use of Techniques for Resolving Data Imbalance Using a Genetic Algorithm (유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용)

  • Jang, Yeong-Sik;Kim, Jong-U;Heo, Jun
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2007.05a
    • /
    • pp.309-320
    • /
    • 2007
  • The data imbalance problem, which can be encountered in data mining classification tasks, typically means that one class has far more or far fewer instances than the other classes. It lowers the prediction accuracy for the minority class because classifiers tend to assign instances to the majority classes and ignore the minority class in order to reduce the overall misclassification rate. To solve the data imbalance problem, a number of techniques have been proposed based on resampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining the previously proposed techniques for the data imbalance problem, and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy for the minority class, we use the F-value of the minority class as the fitness function of the genetic algorithm when determining the combination ratio. To compare its performance with that of the single techniques and of a matrix-style combination with random percentages, we performed experiments on four public datasets that are commonly used to compare methods for the data imbalance problem. The experimental results show the usefulness of the proposed method.
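
The fitness criterion described in this abstract can be illustrated with a minimal sketch. It assumes the "F-value" is the usual F-measure (harmonic mean of precision and recall, with beta = 1 by default); the function name `minority_f_measure` and the toy labels are hypothetical and are not taken from the paper. The example shows why overall accuracy is a poor fitness value on imbalanced data: two predictors with the same accuracy get very different minority-class scores.

```python
def minority_f_measure(y_true, y_pred, minority_label, beta=1.0):
    """F-measure of a single class; a candidate GA fitness value (assumed form)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority_label and p == minority_label)
    fp = sum(1 for t, p in pairs if t != minority_label and p == minority_label)
    fn = sum(1 for t, p in pairs if t == minority_label and p != minority_label)
    if tp == 0:
        # No minority instance was found, so precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data (hypothetical): 8 majority instances (0) and 2 minority instances (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_majority = [0] * 10                      # ignores the minority class
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]        # finds one minority instance

for name, pred in [("always_majority", always_majority), ("one_hit", one_hit)]:
    print(name, accuracy(y_true, pred), minority_f_measure(y_true, pred, 1))
```

Both predictors reach the same accuracy, but only the second one has a non-zero minority-class F-measure, which is the behavior a fitness function for this problem needs to reward.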

Text-independent Speaker Identification by Bagging VQ Classifier

  • Kyung, Youn-Jeong;Park, Bong-Dae;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.2E
    • /
    • pp.17-24
    • /
    • 2001
  • In this paper, we propose a bootstrap aggregating (bagging) vector quantization (VQ) classifier to improve the performance of text-independent speaker recognition systems. This method generates multiple training data sets by resampling the original training data set, constructs the corresponding VQ classifiers, and then integrates the multiple VQ classifiers into a single classifier by voting. The bagging method has been shown to greatly improve the performance of unstable classifiers. Through two different experiments, this paper shows that the VQ classifier is unstable. In one of these experiments, the bias and variance of a VQ classifier are computed with a waveform database. The variance of the VQ classifier is compared with that of the classification and regression tree (CART) classifier[1], and is shown to be as large as that of the CART classifier. The other experiment involves speaker recognition: the speaker recognition rates vary significantly with minor changes in the training data set. Closed-set, text-independent speaker identification experiments are performed with the TIMIT database to compare the performance of the bagging VQ classifier with that of the conventional VQ classifier. The bagging VQ classifier yields improved performance over the conventional VQ classifier. It also outperforms the conventional VQ classifier when the training data set is small.
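
The resample, train, and vote procedure described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: the base learner is a one-codeword-per-class nearest-centroid classifier standing in for a full VQ codebook, and all function names, the number of models, and the toy data are assumptions.

```python
import random
from collections import Counter


def train_centroids(samples):
    """Stand-in for VQ training: one centroid (codeword) per class."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid has the smallest squared distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=sq_dist)


def bagging_train(samples, n_models=15, seed=0):
    """Train one base classifier per bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]  # sample with replacement
        models.append(train_centroids(boot))
    return models


def bagging_classify(models, features):
    """Combine the base classifiers by majority vote."""
    votes = Counter(classify(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical two-speaker toy data: (feature vector, speaker label).
train = [([0.1, -0.2], "A"), ([-0.3, 0.2], "A"), ([0.4, 0.1], "A"),
         ([0.0, 0.3], "A"), ([-0.2, -0.1], "A"), ([0.2, 0.0], "A"),
         ([5.1, 4.8], "B"), ([4.7, 5.2], "B"), ([5.3, 5.1], "B"),
         ([5.0, 4.6], "B"), ([4.9, 5.4], "B"), ([5.2, 5.0], "B")]

models = bagging_train(train)
print(bagging_classify(models, [0.2, 0.1]), bagging_classify(models, [4.8, 5.1]))
```

Voting helps most when the base classifier changes noticeably between resamples, which is the instability the abstract reports for the VQ classifier.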

Printed Hangul Recognition with Adaptive Hierarchical Structures Depending on 6-Types (6-유형 별로 적응적 계층 구조를 갖는 인쇄 한글 인식)

  • Ham, Dae-Sung;Lee, Duk-Ryong;Choi, Kyung-Ung;Oh, Il-Seok
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.1
    • /
    • pp.10-18
    • /
    • 2010
  • Because Hangul character recognition involves a large number of classes, a six-type preclassification stage is commonly used. After the preclassification, the first consonant, vowel, and last consonant can be classified separately. Although each of the three components has only a few classes, classification errors often occur due to shape similarity, such as between 'ㅔ' and 'ㅖ'. This paper therefore proposes a hierarchical recognition method that adopts multi-stage tree structures for each of the six types. In addition, to reduce the interference among the three components, the method uses the recognition results of the first consonant and vowel as features of the vowel classifier. The recognition accuracy on the test set of the PHD08 database was 98.96%.

Feature Analysis on Industrial Accidents of Manufacturing Businesses Using QUEST Algorithm

  • Leem, Young-Moon;Rogers, K.J.;Hwang, Young-Seob
    • International Journal of Safety
    • /
    • v.5 no.1
    • /
    • pp.37-41
    • /
    • 2006
  • The major objective of statistical analysis of industrial accidents is to determine the safety factors so that the number of future accidents can be prevented or decreased by educating those who work in a given industrial field in safety management. So far, however, there has been no quantitative method for evaluating the danger related to industrial accidents. Therefore, as a step toward a quantitative evaluation technique, this study presents a feature analysis of industrial accidents in the manufacturing field using the QUEST algorithm. To analyze the features of industrial accidents, a retrospective analysis was performed on 10,536 subjects (10,313 injured people, 223 deaths). The sample was chosen from data on manufacturing businesses in Korea over a three-year period (2002-2004). This study used AnswerTree of SPSS, and the analysis results enabled us to determine the most important variables affecting injured people, such as the occurrence type, the company size, and the time of occurrence. It was also found that the classification system adopted in the present study using the QUEST algorithm is quite reliable.

Comparative Analysis of Machine Learning Algorithms for Healthy Management of Collaborative Robots (협동로봇의 건전성 관리를 위한 머신러닝 알고리즘의 비교 분석)

  • Kim, Jae-Eun;Jang, Gil-Sang;Lim, KuK-Hwa
    • Journal of the Korea Safety Management & Science
    • /
    • v.23 no.4
    • /
    • pp.93-104
    • /
    • 2021
  • In this paper, we propose a method for diagnosing the overload and working load of collaborative robots through performance analysis of machine learning algorithms. To this end, an experiment was conducted in which a collaborative robot with a payload capacity of 10 kg performed pick & place operations while the payload weight was changed. In this experiment, motor torque, position, and speed data generated by the robot controller were collected, and t-tests and F-tests showed different characteristics for each weight relative to a payload of 10 kg. In addition, to predict overload and working load from the collected data, machine learning algorithms such as Neural Network, Decision Tree, Random Forest, and Gradient Boosting models were used in the experiments. The neural network, with more than 99.6% explanatory power, showed the best performance in prediction and classification. The practical contribution of the proposed study is that it suggests a method for collecting the data required for analysis from the robot without attaching additional sensors to the collaborative robot, and shows the usefulness of machine learning algorithms for diagnosing robot overload and working load.

Steel Plate Faults Diagnosis with S-MTS (S-MTS를 이용한 강판의 표면 결함 진단)

  • Kim, Joon-Young;Cha, Jae-Min;Shin, Junguk;Yeom, Choongsub
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.47-67
    • /
    • 2017
  • Steel plate faults are one of the important factors affecting the quality and price of steel plates. So far, many steelmakers have generally used a visual inspection method based on an inspector's intuition or experience. Specifically, the inspector checks for steel plate faults by looking at the surface of the steel plates. However, the accuracy of this method is so low that judgment errors can exceed 30%. Therefore, an accurate steel plate faults diagnosis system has been continuously required in the industry. To meet this need, this study proposed a new steel plate faults diagnosis system using Simultaneous MTS (S-MTS), an advanced Mahalanobis Taguchi System (MTS) algorithm, to classify various surface defects of steel plates. MTS has generally been used to solve binary classification problems in various fields, but it has not been used for multi-class classification because of its low accuracy. The reason is that only one Mahalanobis space is established in MTS. In contrast, S-MTS is suitable for multi-class classification because it establishes an individual Mahalanobis space for each class. 'Simultaneous' implies comparing Mahalanobis distances at the same time. The proposed steel plate faults diagnosis system was developed in four main stages. In the first stage, after the reference groups and related variables are defined, steel plate fault data are collected and used to establish an individual Mahalanobis space for each reference group and to construct the full measurement scale. In the second stage, the Mahalanobis distances of the test groups are calculated based on the established Mahalanobis spaces of the reference groups. Then, the appropriateness of the spaces is verified by examining the separability of the Mahalanobis distances. In the third stage, orthogonal arrays and the dynamic-type Signal-to-Noise (SN) ratio are applied for variable optimization. The overall SN ratio gain is then derived from the SN ratio and SN ratio gain. If the derived overall SN ratio gain is negative, the variable should be removed, whereas a variable with a positive gain may be considered worth keeping. Finally, in the fourth stage, the measurement scale is reconstructed from the selected useful variables. Next, an experimental test is carried out to verify the multi-class classification ability and thus obtain the classification accuracy. If the accuracy is acceptable, the diagnosis system can be used for future applications. This study also compared the accuracy of the proposed steel plate faults diagnosis system with that of other popular classification algorithms, including Decision Tree, Multi-Layer Perceptron Neural Network (MLPNN), Logistic Regression (LR), Support Vector Machine (SVM), Tree Bagger Random Forest, Grid Search (GS), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). The steel plate faults dataset used in the study was taken from the University of California at Irvine (UCI) machine learning repository. The proposed steel plate faults diagnosis system based on S-MTS shows a classification accuracy of 90.79%, which is 6-27% higher than that of MLPNN, LR, GS, GA, and PSO. Given that the accuracy of commercial systems is only about 75-80%, the proposed system has sufficient classification performance to be applied in the industry. In addition, the proposed system can reduce the number of measurement sensors installed in the field because of the variable optimization process. These results show that the proposed system not only performs well in steel plate faults diagnosis but can also reduce operation and maintenance costs. In future work, the proposed system will be applied in the field to validate its actual effectiveness, and its accuracy will be improved based on the results.
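
The core idea in this abstract, one Mahalanobis space per class and assignment to the class with the smallest distance, can be sketched as below. The abstract gives no formulas, so this assumes the common MTS convention: variables are standardized with the reference group's mean and standard deviation, and the squared distance is divided by the number of variables k. The sketch is restricted to two variables so the correlation matrix can be inverted in closed form; the class labels, function names, and data are hypothetical, and the SN-ratio variable selection stage is not shown.

```python
import math


def fit_space(points):
    """Mahalanobis space of one reference group: means, std devs, correlation."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sx = math.sqrt(sum((p[0] - mx) ** 2 for p in points) / n)
    sy = math.sqrt(sum((p[1] - my) ** 2 for p in points) / n)
    r = sum((p[0] - mx) * (p[1] - my) for p in points) / (n * sx * sy)
    return mx, my, sx, sy, r


def scaled_md(space, point):
    """Squared Mahalanobis distance to a space, divided by k = 2 variables."""
    mx, my, sx, sy, r = space
    z1 = (point[0] - mx) / sx
    z2 = (point[1] - my) / sy
    k = 2
    # z^T R^{-1} z for the 2x2 correlation matrix R = [[1, r], [r, 1]].
    return (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / ((1 - r * r) * k)


def classify(spaces, point):
    """Compare the distances to all class spaces at once and pick the smallest."""
    return min(spaces, key=lambda label: scaled_md(spaces[label], point))


# Hypothetical reference groups for two defect classes.
reference = {
    "scratch": [(1.0, 2.0), (2.0, 1.0), (1.5, 2.5), (2.5, 2.0), (2.0, 3.0)],
    "bump": [(8.0, 9.0), (9.0, 8.0), (8.5, 9.5), (9.5, 9.0), (9.0, 10.0)],
}
spaces = {label: fit_space(pts) for label, pts in reference.items()}

print(classify(spaces, (2.0, 2.0)), classify(spaces, (9.0, 9.0)))
```

With this scaling, the average distance of a reference group to its own space is 1, which gives a natural unit for judging how far a test sample lies from each class.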

Ecological Studies on Several Forest Communities in Kwangnung. A Study of the Site Index and the ground vegetation of Larch (광릉삼림의 생태학적 연구 낙엽송의 Site Index와 임상식생에 관하여)

  • 차종환
    • Journal of Plant Biology
    • /
    • v.9 no.1_2
    • /
    • pp.7-16
    • /
    • 1966
  • To determine the factors related to site quality, 13 areas of larch growing in the Kwangnung forest and its vicinity were examined as sample plots. The sample plots included various site classes as well as age classes. Trees were divided into two groups (major and minor trees). The average height of dominant trees was determined by measuring 5 to 6 dominant trees in each sample plot. The average height of dominant 30-year-old trees was the basis for the site index. A Standard Yield Table for the larch produced in the Kwangnung forest was made from various data, which included age class 5, ranging from 10 to 45 years. The relationship among the height of the trees, the site conditions, and the ground vegetation is investigated in this paper. The site indexes of 40 forest class age in the 28-B and 28-G forest classes of the larch associations for ground vegetation differed comparatively widely between the sampled areas. The relation of the direction of forest communities to the height and diameter of the trees showed that communities in the northeast and northwest parts had higher height and diameter values. The diameter and the height of trees were closely related to each other. The smaller the occupied area per tree and the smaller the average distance among trees, the higher the density. The higher the density, the lower the height of the trees. In the ground vegetation of the larch communities, there seems to be a definite correlation between the height of trees and the occupied area per tree or the average distance among the trees. The height of trees and site index of the two larch communities were as follows: 28-B forest class, site index 20.8, height 24.0 m; 28-G forest class, site index 18.4, height 20.9 m. The ground layer was analyzed by the quadrat method (20/20 sq. cm) with an interval of 1 m, using 40 quadrats set up in the larch communities. The community structure of the ground vegetation of the two larch communities was analyzed, and the importance value was calculated and then evaluated. The ground vegetation under the larch had developed a Burmannii Beauv stratal society below the 28-B and 28-G forest classes. Accordingly, Burmannii Beauv had the highest importance value in the ground vegetation of both larch communities. Therefore, this species could be quantitatively considered the forest indicator species. Of the 34 species in the ground vegetation under the two larch communities, 18 were common to both. The 28-B forest class had more ground vegetation than the 28-G forest class. The similarity of the ground vegetation was measured by the Frequency Index Community Coefficient. The differences between the associations were clearly manifested by the ground vegetation tested by Gleason's Frequency Index of Community Coefficient for the analysis of each stratal society of all associations. According to the F.I.C.C., the ground vegetation under the two larch (28-B and 28-G) forest classes showed higher values. An investigation into the relationship between the physical and chemical properties of the soil and the site was considered the next step in the study of larch site classification.

A Machine Learning Approach for Mechanical Motor Fault Diagnosis (기계적 모터 고장진단을 위한 머신러닝 기법)

  • Jung, Hoon;Kim, Ju-Won
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.40 no.1
    • /
    • pp.57-64
    • /
    • 2017
  • To reduce damage to major railroad components, which can cause interruptions to railroad services, safety accidents, and unnecessary maintenance costs, the development of rolling stock maintenance technology is shifting, led by advanced countries, from preventive maintenance based on the inspection period to predictive maintenance. Furthermore, to enhance reliability as systems become faster while also reducing maintenance costs, the demand for fault diagnosis and prognostic health management technology is increasing. The objective of this paper is to propose a highly reliable learning model, using various machine learning algorithms, that can be applied to critical rolling stock components. This paper presents a model for railway rolling stock component fault diagnosis and conducts a mechanical failure diagnosis of motor components by applying machine learning techniques in order to ensure efficient maintenance support, along with a data preprocessing plan for component fault diagnosis. This paper first defines a failure diagnosis model for rolling stock components. The function-based algorithms ANFIS and SMO were used as machine learning techniques for generating the failure diagnosis model. Two tree-based algorithms, RandomForest and CART, were also employed. To evaluate the performance of the algorithms for diagnosing failures in motors as a critical railroad component, an experiment was carried out on 2 data sets with different classes (6 classes and 3 class levels). According to the results of the experiment, the random forest algorithm, a tree-based machine learning technique, showed the best performance.

Automatic Identification of Database Workloads by using SVM Workload Classifier (SVM 워크로드 분류기를 통한 자동화된 데이터베이스 워크로드 식별)

  • Kim, So-Yeon;Roh, Hong-Chan;Park, Sang-Hyun
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.4
    • /
    • pp.84-90
    • /
    • 2010
  • DBMSs are used for a range of applications, from data warehousing to on-line transaction processing. As a result of this demand, DBMSs have continued to grow in size. This growth makes manual performance tuning of the DBMS a major issue. DBMS tuning should adapt to the type of workload placed upon the system, but identifying workloads in mixed database applications can be quite difficult. Therefore, a method is needed for identifying workloads in a mixed database environment. In this paper, we propose an SVM workload classifier to automatically identify a DBMS workload. Database workloads are collected from the TPC-C and TPC-W benchmarks while changing the resource parameters. The parameters of the SVM workload classifier, C and the kernel parameter, were chosen experimentally. The experiments revealed that the accuracy of the proposed SVM workload classifier is about 9% higher than that of the Decision Tree, Naive Bayes, Multilayer Perceptron, and K-NN classifiers.