• 제목/요약/키워드: multiple classification analysis

검색결과 471건 처리시간 0.025초

Classification via principal differential analysis

  • Jang, Eunseong;Lim, Yaeji
    • Communications for Statistical Applications and Methods
    • /
    • 제28권2호
    • /
    • pp.135-150
    • /
    • 2021
  • We propose principal differential analysis based classification methods. Computations of squared multiple correlation function (RSQ) and principal differential analysis (PDA) scores are reviewed; in addition, we combine principal differential analysis results with the logistic regression for binary classification. In the numerical study, we compare the principal differential analysis based classification methods with functional principal component analysis based classification. Various scenarios are considered in a simulation study, and principal differential analysis based classification methods classify the functional data well. Gene expression data is considered for real data analysis. We observe that the PDA score based method also performs well.

분류와 회귀나무분석에 관한 소고 (Note on classification and regression tree analysis)

  • 임용빈;오만숙
    • 품질경영학회지
    • /
    • 제30권1호
    • /
    • pp.152-161
    • /
    • 2002
  • The analysis of large data sets with hundreds of thousands observations and thousands of independent variables is a formidable computational task. A less parametric method, capable of identifying important independent variables and their interactions, is a tree structured approach to regression and classification. It gives a graphical and often illuminating way of looking at data in classification and regression problems. In this paper, we have reviewed and summarized tile methodology used to construct a tree, multiple trees and the sequential strategy for identifying active compounds in large chemical databases.

Alternative accuracy for multiple ROC analysis

  • Hong, Chong Sun;Wu, Zhi Qiang
    • Journal of the Korean Data and Information Science Society
    • /
    • 제25권6호
    • /
    • pp.1521-1530
    • /
    • 2014
  • The ROC analysis is considered for multiple class diagnosis. There exist many criteria to find optimal thresholds and measure the accuracy of diagnostic tests for k dimensional ROC analysis. In this paper, we proposed a diagnostic accuracy measure called the correct classification simple rate, which is defined as the summation of true rates for each classification distribution and expressed as a function of summation of sequential true rates for two consecutive distributions. This measure does not weight accuracy across categories by the category prevalence and is comparable across populations for multiple class diagnosis. It is found that this accuracy measure does not only have a relationship with Kolmogorov - Smirnov statistics, but also can be represented as a linear function of some optimal threshold criteria. With these facts, the suggested measure could be applied to test for comparing multiple distributions.

HOS 특징 벡터를 이용한 장애 음성 분류 성능의 향상 (Performance Improvement of Classification Between Pathological and Normal Voice Using HOS Parameter)

  • 이지연;정상배;최흥식;한민수
    • 대한음성학회지:말소리
    • /
    • 제66호
    • /
    • pp.61-72
    • /
    • 2008
  • This paper proposes a method to improve pathological and normal voice classification performance by combining multiple features such as auditory-based and higher-order features. Their performances are measured by Gaussian mixture models (GMMs) and linear discriminant analysis (LDA). The combination of multiple features proposed by the frame-based LDA method is shown to be an effective method for pathological and normal voice classification, with a 87.0% classification rate. This is a noticeable improvement of 17.72% compared to the MFCC-based GMM algorithm in terms of error reduction.

  • PDF

Multi-Frame Face Classification with Decision-Level Fusion based on Photon-Counting Linear Discriminant Analysis

  • Yeom, Seokwon
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • 제14권4호
    • /
    • pp.332-339
    • /
    • 2014
  • Face classification has wide applications in security and surveillance. However, this technique presents various challenges caused by pose, illumination, and expression changes. Face recognition with long-distance images involves additional challenges, owing to focusing problems and motion blurring. Multiple frames under varying spatial or temporal settings can acquire additional information, which can be used to achieve improved classification performance. This study investigates the effectiveness of multi-frame decision-level fusion with photon-counting linear discriminant analysis. Multiple frames generate multiple scores for each class. The fusion process comprises three stages: score normalization, score validation, and score combination. Candidate scores are selected during the score validation process, after the scores are normalized. The score validation process removes bad scores that can degrade the final output. The selected candidate scores are combined using one of the following fusion rules: maximum, averaging, and majority voting. Degraded facial images are employed to demonstrate the robustness of multi-frame decision-level fusion in harsh environments. Out-of-focus and motion blurring point-spread functions are applied to the test images, to simulate long-distance acquisition. Experimental results with three facial data sets indicate the efficiency of the proposed decision-level fusion scheme.

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

  • Lim, Hyun-il
    • Journal of Information Processing Systems
    • /
    • 제18권2호
    • /
    • pp.268-281
    • /
    • 2022
  • The development of information technology is bringing many changes to everyday life, and machine learning can be used as a technique to solve a wide range of real-world problems. Analysis and utilization of data are essential processes in applying machine learning to real-world problems. As a method of processing data in machine learning, we propose an approach based on applying multiple linear regression models by interlacing data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data. A combination of these multiple models is then used as the prediction model for classifying similar software. Experiments are performed to evaluate the proposed approach as compared to conventional linear regression, and the experimental results show that the proposed method classifies similar software more accurately than the conventional model. We anticipate the proposed approach to be applied to various kinds of classification problems to improve the accuracy of conventional linear regression.

Contribution to Improve Database Classification Algorithms for Multi-Database Mining

  • Miloudi, Salim;Rahal, Sid Ahmed;Khiat, Salim
    • Journal of Information Processing Systems
    • /
    • 제14권3호
    • /
    • pp.709-726
    • /
    • 2018
  • Database classification is an important preprocessing step for the multi-database mining (MDM). In fact, when a multi-branch company needs to explore its distributed data for decision making, it is imperative to classify these multiple databases into similar clusters before analyzing the data. To search for the best classification of a set of n databases, existing algorithms generate from 1 to ($n^2-n$)/2 candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, that is, without taking into account the use of clusters from the previous classification. Consequently, existing algorithms are time consuming, especially when the number of candidate classifications increases. To overcome the latter problem, we propose in this paper an efficient approach that represents the problem of classifying the multiple databases as a problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm the efficiency of our algorithm against existing works and that it overcomes the problem of increase in the execution time.

DEA의 교차효율성을 활용한 다기준 ABC 재고 분류 방법 연구 (Multi-Criteria ABC Inventory Classification Using the Cross-Efficiency Method in DEA)

  • 박재훈;배혜림;임성묵
    • 대한산업공학회지
    • /
    • 제37권4호
    • /
    • pp.358-366
    • /
    • 2011
  • Multi-criteria ABC inventory classification, which aims to classify inventory items by considering more than one criterion, is one of the most widely employed techniques for inventory control. The weighted linear optimization (WLO) model proposed by Ramanathan (2006) solves the problem of multi-criteria ABC inventory classification by generating a set of criterion weights for each inventory item and assigning a normalized score to the item for ABC analysis. However, the WLO model has some limitations. First, many inventory items can share the same optimal score, which can hinder a precise classification of inventory items. Second, the model allows too much flexibility in weighting multiple criteria; each item is allowed to choose its own weights so that it can maximize its score. As a result, if an item dominates the others in terms of a certain criterion, it may be classified into a higher class regardless of other criteria by assigning an overwhelming weight to the criterion. Consequently, an item with a high value in an unimportant criterion and low values in others may be inappropriately classified as class A, leading to an inaccurate classification of inventory items. To overcome these shortcomings, we extend the WLO model by using the cross-efficiency method in data envelopment analysis. We claim that the proposed model can provide a more reasonable and accurate classification of inventory items by mitigating the adverse effect of flexibility in the choice of weights and yielding a unique ordering of inventory items.

Movie Review Classification Based on a Multiple Classifier

  • Tsutsumi, Kimitaka;Shimada, Kazutaka;Endo, Tsutomu
    • 한국언어정보학회:학술대회논문집
    • /
    • 한국언어정보학회 2007년도 정기학술대회
    • /
    • pp.481-488
    • /
    • 2007
  • In this paper, we propose a method to classify movie review documents into positive or negative opinions. There are several approaches to classify documents. The previous studies, however, used only a single classifier for the classification task. We describe a multiple classifier for the review document classification task. The method consists of three classifiers based on SVMs, ME and score calculation. We apply two voting methods and SVMs to the integration process of single classifiers. The integrated methods improved the accuracy as compared with the three single classifiers. The experimental results show the effectiveness of our method.

  • PDF

A Multi-Class Classifier of Modified Convolution Neural Network by Dynamic Hyperplane of Support Vector Machine

  • Nur Suhailayani Suhaimi;Zalinda Othman;Mohd Ridzwan Yaakub
    • International Journal of Computer Science & Network Security
    • /
    • 제23권11호
    • /
    • pp.21-31
    • /
    • 2023
  • In this paper, we focused on the problem of evaluating multi-class classification accuracy and simulation of multiple classifier performance metrics. Multi-class classifiers for sentiment analysis involved many challenges, whereas previous research narrowed to the binary classification model since it provides higher accuracy when dealing with text data. Thus, we take inspiration from the non-linear Support Vector Machine to modify the algorithm by embedding dynamic hyperplanes representing multiple class labels. Then we analyzed the performance of multi-class classifiers using macro-accuracy, micro-accuracy and several other metrics to justify the significance of our algorithm enhancement. Furthermore, we hybridized Enhanced Convolution Neural Network (ECNN) with Dynamic Support Vector Machine (DSVM) to demonstrate the effectiveness and efficiency of the classifier towards multi-class text data. We performed experiments on three hybrid classifiers, which are ECNN with Binary SVM (ECNN-BSVM), and ECNN with linear Multi-Class SVM (ECNN-MCSVM) and our proposed algorithm (ECNNDSVM). Comparative experiments of hybrid algorithms yielded 85.12 % for single metric accuracy; 86.95 % for multiple metrics on average. As for our modified algorithm of the ECNN-DSVM classifier, we reached 98.29 % micro-accuracy results with an f-score value of 98 % at most. For the future direction of this research, we are aiming for hyperplane optimization analysis.