• Title/Summary/Keyword: classification algorithms

Evaluation on Performance of Accuracy for Analysis and Classification of Data Related to Industrial Accidents (산업재해 데이터의 분석 및 분류를 위한 정확도 성능 평가)

  • Leem Young-Moon;Ryu Chang-Hyun
    • Proceedings of the Safety Management and Science Conference / 2006.04a / pp.51-56 / 2006
  • Data mining techniques have recently been used to analyze and classify data on industrial accidents. The main objective of this study is to compare the performance of algorithms for analyzing industrial accident data. The paper provides a comparative analysis of five algorithms, CHAID, CART, C4.5, LR (Logistic Regression) and NN (Neural Network), using ROC charts, lift charts and response thresholds. Data on 67,278 accidents were analyzed to create risk groups for a number of complications, including the risk of disease and accident. The sample was drawn from data on manufacturing industries in Korea over three years $(2002\sim2004)$. According to the results, NN shows excellent performance for the analysis and classification of industrial accident data.
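
The ROC and lift comparisons mentioned in this abstract can be illustrated with a minimal sketch. The functions `roc_auc` and `lift_at`, and the toy labels and scores, are hypothetical and are not taken from the paper; the sketch only shows how the two measures are commonly computed from a classifier's ranked scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at(labels, scores, fraction):
    """Lift in the top `fraction` of cases when ranked by descending score."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda t: -t[0])]
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(ranked[:k]) / k          # positive rate among the top-k cases
    base_rate = sum(ranked) / len(ranked)   # positive rate over all cases
    return top_rate / base_rate


# Toy data (made up for illustration): 1 = event of interest, 0 = otherwise.
labels = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
print(roc_auc(labels, scores), lift_at(labels, scores, 0.2))
```

Running the same two functions on the scores of each candidate model gives one simple way to rank models side by side.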

Performance Comparison of Decision Trees of J48 and Reduced-Error Pruning

  • Jin, Hoon;Jung, Yong Gyu
    • International journal of advanced smart convergence / v.5 no.1 / pp.30-33 / 2016
  • With the advent of big data, data mining is increasingly used in various decision-making fields to extract hidden and meaningful information from large amounts of data. As demand for uncovering the hidden meaning behind data grows exponentially, deciding which data mining algorithm to select and how to use it becomes more and more important. Several data mining algorithms are widely used in biology and clinical practice, including logistic regression, neural networks, support vector machines, and a variety of statistical techniques. This paper compares the classification performance of two representative ML algorithms, J48 and REPTree. The performance comparison identifies the more accurate classification algorithm, and more accurate prediction is possible with that algorithm for the goal of the experiment. Based on this, visually detailed classification and distinction are expected to be relatively difficult.

A Construction of Fuzzy Model for Data Mining (데이터 마이닝을 위한 퍼지 모델 동정)

  • Kim, Do-Wan;Park, Jin-Bae;Kim, Jung-Chan;Joo, Young-Hoon
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.191-194 / 2002
  • In this paper, a new GA-based methodology with information granules is suggested for the construction of a fuzzy classifier. We deal with the selection of the fuzzy region as well as two major classification problems: feature selection and pattern classification. The proposed method consists of three steps: selection of the fuzzy region, construction of the fuzzy sets, and tuning of the fuzzy rules. Genetic algorithms (GAs) are applied to the development of the information granules so as to determine satisfactory fuzzy regions. Finally, the GAs are also applied to the tuning of the fuzzy rules to handle misclassified data (e.g., data with unusual patterns or data on class boundaries). To show the effectiveness of the proposed method, an example, the classification of the Iris data, is provided.

Korean Voice Phishing Text Classification Performance Analysis Using Machine Learning Techniques (머신러닝 기법을 이용한 한국어 보이스피싱 텍스트 분류 성능 분석)

  • Boussougou, Milandu Keith Moussavou;Jin, Sangyoon;Chang, Daeho;Park, Dong-Joo
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.297-299 / 2021
  • Text classification is one of the popular tasks in Natural Language Processing (NLP), used to classify texts or documents in applications such as sentiment analysis and email filtering. Nowadays, state-of-the-art (SOTA) Machine Learning (ML) and Deep Learning (DL) algorithms are the core engines used to perform these classification tasks with high accuracy, and they show satisfying results. This paper conducts a benchmark performance analysis of multiple SOTA algorithms on KorCCVi, the first known labeled Korean voice phishing dataset. Experimental results on a test set of 366 samples reveal which algorithm performs best considering training time and metrics such as accuracy and F1 score.
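
As a reminder of how the two metrics named in this abstract are defined, here is a minimal sketch for a binary task. The function `accuracy_and_f1`, the label encoding (1 for the positive class) and the toy predictions are assumptions made for illustration; they do not come from the paper or from KorCCVi.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy predictions (made up): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(accuracy_and_f1(y_true, y_pred))
```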

Musical Genre Classification Based on Deep Residual Auto-Encoder and Support Vector Machine

  • Xue Han;Wenzhuo Chen;Changjian Zhou
    • Journal of Information Processing Systems / v.20 no.1 / pp.13-23 / 2024
  • Music brings pleasure and relaxation to people. Therefore, it is necessary to classify musical genres based on scenes. Identifying favorite musical genres from massive music data is a time-consuming and laborious task. Recent studies have suggested that machine learning algorithms are effective in distinguishing between various musical genres. However, meeting the actual requirements in terms of accuracy or timeliness is challenging. In this study, a hybrid machine learning model that combines a deep residual auto-encoder (DRAE) and support vector machine (SVM) for musical genre recognition was proposed. Eight manually extracted features from the Mel-frequency cepstral coefficients (MFCC) were employed in the preprocessing stage as the hybrid music data source. During the training stage, DRAE was employed to extract feature maps, which were then used as input for the SVM classifier. The experimental results indicated that this method achieved a 91.54% F1-score and 91.58% top-1 accuracy, outperforming existing approaches. This novel approach leverages deep architecture and conventional machine learning algorithms and provides a new horizon for musical genre classification tasks.

Plain Fingerprint Classification Based on a Core Stochastic Algorithm

  • Baek, Young-Hyun;Kim, Byunggeun
    • IEIE Transactions on Smart Processing and Computing / v.5 no.1 / pp.43-48 / 2016
  • We propose plain fingerprint classification based on a core stochastic algorithm that effectively uses a core stochastic model, acquiring more fingerprint minutiae and direction, in order to increase matching performance. The proposed core stochastic algorithm uses core presence/absence and contains a ridge direction and distribution map. Simulations show that the fingerprint classification accuracy is improved by more than 14%, on average, compared to other algorithms.

Classification of the vegetated terrain using polarimetric SAR processing techniques

  • Park Sang-Eun;Moon Wooil M
    • Proceedings of the KSRS Conference / 2004.10a / pp.389-392 / 2004
  • Classification of the Earth's natural components within a fully polarimetric SAR image is one of the most important applications of radar polarimetry in remote sensing. In this paper, unsupervised classification algorithms based on the combined use of polarimetric processing techniques, such as target decomposition and the statistical complex Wishart classification method, are evaluated and applied to vegetated terrain on Jeju volcanic island.

Classification of Multi Spectral Image Data using Rough Sets (러프 집합을 이용한 다중 분광 이미지 데이터의 분류)

  • 원성현;이병성;정환묵
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.11a / pp.205-208 / 1997
  • Classification of remotely sensed image data has traditionally been one of the important tasks in image data analysis. Many researchers have therefore worked to increase the accuracy of the analysis, and many classification algorithms have been proposed. In this paper, we propose a new classification method for remotely sensed image data that uses rough set theory. Using the indiscernibility relation of rough sets, we show that image data can be classified very easily.

An Improved Text Classification (향상된 텍스트 분류)

  • Wang, Guangxing;Shin, Seong-Yoon;Shin, Kwang-Weong;Lee, Hyun-Chang
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.125-126 / 2019
  • In this paper, we propose an improved kNN classification method. Accuracy is improved by refining the method and normalizing the data. We then compare three classification algorithms with the improved algorithm on experimental data.
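
The abstract does not say how kNN was improved, so the following is only a sketch of plain kNN with min-max normalization, one common way data normalization can affect accuracy. The helper names (`min_max_fit`, `min_max_apply`, `knn_predict`) and the toy data are hypothetical and are not the authors' method.

```python
import math
from collections import Counter


def min_max_fit(rows):
    """Per-feature minimum and maximum over the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def min_max_apply(row, lo, hi):
    """Scale each feature to [0, 1]; constant features map to 0."""
    return [(x - a) / (b - a) if b > a else 0.0 for x, a, b in zip(row, lo, hi)]


def knn_predict(train_X, train_y, query, k=3, normalize=True):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    if normalize:
        lo, hi = min_max_fit(train_X)
        train_X = [min_max_apply(r, lo, hi) for r in train_X]
        query = min_max_apply(query, lo, hi)
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (made up): the second feature has a much larger scale than the first.
X = [[1.0, 1000.0], [1.2, 5000.0], [1.1, 9000.0],
     [3.0, 1200.0], [3.2, 5200.0], [3.1, 8800.0]]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, [1.3, 5100.0]))                   # with normalization
print(knn_predict(X, y, [1.3, 5100.0], normalize=False))  # without normalization
```

On this toy data the two calls disagree, because without scaling the large-valued second feature dominates the distance.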

On the development of data-based damage diagnosis algorithms for structural health monitoring

  • Kiremidjian, Anne S.
    • Smart Structures and Systems / v.30 no.3 / pp.263-271 / 2022
  • In this paper we present an overview of damage diagnosis algorithms developed over the past two decades that use vibration signals obtained from structures. The paper then focuses primarily on algorithms that can be used after an extreme event, such as a large earthquake, to identify structural damage so that a timely response is possible. The algorithms presented use measurements obtained from accelerometers and gyroscopes to identify the occurrence of damage and to classify it. Example algorithms presented include those based on autoregressive moving average (ARMA) models, wavelet energies from the wavelet transform, and rotation models. The algorithms are illustrated through application to data from test structures such as the ASCE Benchmark structure and laboratory tests of scaled bridge columns and steel frames. The paper concludes by identifying research and development needs for such algorithms to become viable in practice.