• Title/Abstract/Keywords: CLASSIFICATION ANALYSIS

Search Result 8,013, Processing Time 0.034 seconds

Random projection ensemble adaptive nearest neighbor classification (랜덤 투영 앙상블 기법을 활용한 적응 최근접 이웃 판별분류기법)

  • Kang, Jongkyeong;Jhun, Myoungshic
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.401-410 / 2021
  • k-nearest neighbor classification is popular in discriminant analysis, but because it uses a fixed number of neighbors it does not reflect the local characteristics of the data. The adaptive nearest neighbor method was developed to select the number of neighbors according to the local structure of the data. When analyzing high-dimensional data, it is common to apply dimension reduction, such as random projection, before k-nearest neighbor classification. Recently, an ensemble technique has been developed that carefully combines the results of such random classifiers and makes the final assignment by voting. In this paper, we propose a novel discriminant classification technique that combines adaptive nearest neighbor methods with random projection ensemble techniques for the analysis of high-dimensional data. Through simulations and real-data analyses, we confirm that the proposed method achieves higher classification accuracy than previously developed methods.
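The voting idea described in this abstract can be illustrated with a minimal sketch. It is a simplification and not the authors' method: it uses a fixed k rather than an adaptive number of neighbors, and it keeps every random projection instead of carefully selecting them. All function names and default values below are hypothetical.

```python
import math
import random
from collections import Counter

def random_projection(dim_in, dim_out, rng):
    """Draw a dim_out x dim_in Gaussian projection matrix."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]

def project(matrix, x):
    """Apply the projection matrix to a single vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]

def knn_predict(train_X, train_y, query, k):
    """Majority label among the k nearest training points (fixed k)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def rp_ensemble_predict(train_X, train_y, query,
                        n_projections=15, dim_out=2, k=3, seed=0):
    """Classify in several random low-dimensional views, then vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_projections):
        P = random_projection(len(query), dim_out, rng)
        low_dim_train = [project(P, x) for x in train_X]
        label = knn_predict(low_dim_train, train_y, project(P, query), k)
        votes[label] += 1
    return votes.most_common(1)[0][0]
```

Each base classifier sees only a 2-dimensional view of the data, so a single projection can be misleading; the final vote across projections is what the ensemble relies on.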

Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA

  • Jeon, Dong-Ha;Lee, Soo-Jin
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.123-130 / 2022
  • Recently, studies on the detection and classification of Android malware based on API call sequences have been actively carried out. However, API call sequence based malware classification has serious limitations, such as excessive time and resource consumption for malware analysis and learning model construction, due to the vast amount of data and the high dimensionality of the features. In this study, we analyzed various classification models, such as LightGBM, Random Forest, and k-Nearest Neighbors, after significantly reducing the dimension of the features using PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains vast API call information. The experimental results show that PCA significantly reduces the dimension of the features while maintaining the characteristics of the original data, and achieves efficient malware classification performance. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the features are reduced to less than 1% of their original size.
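As a rough illustration of the dimension-reduction step, the sketch below finds the first principal component by power iteration on the sample covariance matrix and projects the data onto it. This is an assumption-laden toy, not the study's pipeline: the paper does not say how PCA was computed, a practical implementation would use a linear-algebra library and keep several components, and all function names here are hypothetical.

```python
import math

def center(X):
    """Subtract the per-feature mean from every row."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(C, iters=200):
    """Leading eigenvector of C by power iteration."""
    v = [1.0] * len(C)
    for _ in range(iters):
        w = [sum(cij * vj for cij, vj in zip(row, v)) for row in C]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: no variance to follow
            break
        v = [x / norm for x in w]
    return v

def pca_1d(X):
    """Return the first principal direction and the 1-D scores."""
    Xc = center(X)
    pc = first_component(covariance(Xc))
    scores = [sum(a * b for a, b in zip(row, pc)) for row in Xc]
    return pc, scores
```

The reduced scores, rather than the original features, would then be passed to a classifier such as k-Nearest Neighbors.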

Stress Level Based Emotion Classification Using Hybrid Deep Learning Algorithm

  • Sivasankaran Pichandi;Gomathy Balasubramanian;Venkatesh Chakrapani
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.3099-3120 / 2023
  • The present fast-moving era brings serious stress issues that affect both older and younger people. Everyone experiences stress at least once in their lifetime. Stress is higher among younger people because they are new to the working environment, whereas stress among older people affects both individual and overall performance in an organization. Electroencephalogram (EEG) based stress level classification is one of the most widely used methodologies for stress detection. However, the signal processing methods developed so far have limitations, as most stress classification models compute the stress level in a predefined environment to detect individual stress factors. Specifically, machine learning based stress classification models require an additional algorithm for feature extraction, which increases the computation cost. Also, because of the limited feature learning capability of machine learning algorithms, classification performance degrades and is sometimes inaccurate. Numerous research works show that deep learning models outperform machine learning techniques. Thus, this work presents a hybrid deep learning algorithm to classify emotions based on stress level. Compared with conventional deep learning models, hybrid models handle features better: deep learning models provide better feature extraction and selection, and adding machine learning classifiers to a deep learning architecture enhances classification performance. Accordingly, a hybrid convolutional neural network model is presented that extracts features using a CNN and classifies them with a support vector machine. Simulation analysis on benchmark datasets demonstrates the performance of the proposed model. Finally, existing methods are comparatively analyzed to demonstrate that the proposed hybrid combination performs better.

A Study on the Design of a Knowledge Base for Automatic Book Classification (도서분류자동화를 위한 지식베이스의 설계에 관한 연구)

  • 이경호
    • Journal of Korean Library and Information Science Society / v.18 / pp.139-192 / 1991
  • Though the computer has become deeply entrenched as the major tool in information processing (library work), automatic book classification techniques are still experimental and have not yet been tested against the criterion of usefulness. The purpose of this study is to design a knowledge base for automatic book classification that can be put to use in library operations, and to present a methodology for applying automatic classification in the library. Because the existing enumerative classification schemes are manual systems and cannot be applied to automatic classification, the principle of faceted classification based on concept analysis is introduced and studied. The results of this study are summarized as follows: 1. The design of the knowledge base was confined to the fields of agriculture and medicine. 2. When a title is entered at the computer keyboard, it is searched for in the knowledge base and then classified according to the principle of automatic classification. 3. Program flowcharts were designed as a basis for the classification procedures for automatic subject recognition and classification. 4. 283 books in agriculture and 196 books in medicine were drawn at random from Taegu University Library and Young-Nal Medical Center Library, respectively. 5. The automatic classification experiment was performed on 143 books in agriculture and 166 books in medicine, excluding books on other subjects. 6. It was shown that automatic book classification is possible through the design of a knowledge base. In addition, the expected benefits of designing a knowledge base for automatic book classification are as follows: 1. Prompt and accurate classification is possible. 2. Whichever library classifies a given title, a program can assign it the same classification number. 3. Users can retrieve, through the computer, the classification codes of the books they want to search for. 4. Because the concept coordination method is employed, the representation of a multi-subject concept is simplified. 5. Performing automatic book classification makes automation of the total system achievable. 6. Efficient international information transfer will be advanced because all institutions maintain unified classification numbers.

Local Linear Logistic Classification of Microarray Data Using Orthogonal Components (직교요인을 이용한 국소선형 로지스틱 마이크로어레이 자료의 판별분석)

  • Baek, Jang-Sun;Son, Young-Sook
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.587-598 / 2006
  • In microarray data, the number of variables exceeds the number of samples. We propose a nonparametric local linear logistic classification procedure using orthogonal components for classifying high-dimensional microarray data. The proposed method is based on the local likelihood and can be applied to multi-class classification. We applied the local linear logistic classification method, using PCA, PLS, and factor analysis components as new features, to leukemia data and colon data, and compared its performance with that of conventional statistical classification procedures. The proposed method outperforms the conventional ones for each type of component, and among the three orthogonal components PLS shows the best performance when embedded in the proposed method.

MONITORING OF MOUNTAINOUS AREAS USING SIMULATED IMAGES TO KOMPSAT-II

  • Chang Eun-Mi;Shin Soo-Hyun
    • Proceedings of the KSRS Conference / 2005.10a / pp.653-655 / 2005
  • More than 70 percent of the land territory of Korea is mountainous, and degradation of these areas grows more serious year by year due to illegal tombs, expanding golf courses, and stone mine development. We examine the potential use of high-resolution imagery for monitoring these phenomena. In this project we classified tombs and identified the statistical radiometric characteristics of graves. Based on the field survey, the graves could be classified into 4 groups. When the groupings obtained from clustering and from discriminant analysis were compared, the two results coincided. An object-oriented classification algorithm for feature extraction was studied theoretically in this project. We also carried out a pilot project using mixed methods; that is, conventional methods such as unsupervised and supervised classification were combined with a new feature extraction method, object-oriented classification. This methodology showed about 60% classification accuracy for extracting tombs from satellite imagery. The geographical coordinates of tombs, and the graves themselves, were extracted from the satellite image. Stone mines and golf courses were extracted using NDVI and GVI, with a classification accuracy of around 89 percent. The location accuracy results showed that extracting tombs from one-meter resolution imagery is cheaper and quicker than the GPS method. Finally, we interviewed local government officers and analyzed the current situation of mountainous area management and the potential use of KOMPSAT-II images. Based on the requirement analysis, we developed software that serves as a management and monitoring system for mountainous areas for local governments.
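For readers unfamiliar with NDVI, one of the two indices mentioned in this abstract, the sketch below computes the standard definition, NDVI = (NIR - Red) / (NIR + Red), per pixel. The thresholding helper and its 0.3 cutoff are hypothetical illustrations, not values from the paper, and GVI is not shown because its coefficients depend on the sensor.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0  # convention assumed here for pixels with no signal
    return (nir - red) / denom

def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Flag pixels whose NDVI exceeds a (hypothetical) threshold."""
    return [[ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]
```

Bare surfaces such as stone mines tend to have low NDVI, while dense vegetation has high NDVI; which thresholds separate golf-course turf from surrounding forest would have to be determined from the imagery itself.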

An Analysis of the Application Framework of the Business Reference Model to Records Classification Schemes in Korean Central Government Agencies (기록분류를 위한 정부기능분류체계의 적용 구조 및 운용 분석 - 중앙행정기관을 중심으로 -)

  • Seol, Moon-Won
    • Journal of the Korean BIBLIA Society for library and Information Science / v.24 no.4 / pp.23-51 / 2013
  • The purpose of this study is to examine the potential and limits of the Business Reference Model (BRM) as a records classification scheme in Korean central government agencies. The analysis is based on data collected through three focus group interviews in which six records professionals from central government agencies participated. The paper begins by examining the framework of BRM-based records classification required by the Public Records Management Act. It explores the types of benefit gained from applying the BRM to government records classification. Based on the data collected from the interviews, it investigates how records are aggregated and how the transaction level (Danwi-Gwaje) of the BRM is applied in the course of records aggregation.

A new classification method using penalized partial least squares (벌점 부분최소자승법을 이용한 분류방법)

  • Kim, Yun-Dae;Jun, Chi-Hyuck;Lee, Hye-Seon
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.931-940 / 2011
  • Classification generates a rule for assigning objects to several categories based on a learning sample. A good classification model should classify new objects with a low misclassification error. Many types of classification methods have been developed, including logistic regression, discriminant analysis, and trees. This paper presents a new classification method using penalized partial least squares. Penalized partial least squares can make the model more robust and remedy the multicollinearity problem. The paper compares the proposed method with logistic regression and PCA-based discriminant analysis on several real and artificial data sets. It concludes that the new method performs better than the other methods.

A Study on Efficient Topography Classification of High Resolution Satelite Image (고해상도 위성영상의 효율적 지형분류기법 연구)

  • Lim, Hye-Young;Kim, Hwang-Soo;Choi, Joon-Seog;Song, Seung-Ho
    • Journal of Korean Society for Geospatial Information Science / v.13 no.3 s.33 / pp.33-40 / 2005
  • The aim of remotely sensed data classification is to produce the most accurate map of the earth's surface by assigning each pixel to its appropriate real-world category. The classification of satellite multispectral image data has become a tool for generating ground cover maps, and many classification methods exist. In this study, MLC (Maximum Likelihood Classification), ANN (Artificial Neural Network), SVM (Support Vector Machine), and Naive Bayes classifier algorithms are compared using an IKONOS image of part of Dalsung Gun in the Daegu area. Two preprocessing methods, PCA (Principal Component Analysis) and ICA (Independent Component Analysis), are applied. Boosting algorithms are also applied. The best results were obtained by combining appropriate feature-selection preprocessing with a classifier.

Classification of Korean Traditional Musical Instruments Using Feature Functions and k-nearest Neighbor Algorithm (특성함수 및 k-최근접이웃 알고리즘을 이용한 국악기 분류)

  • Kim Seok-Ho;Kwak Kyung-Sup;Kim Jae-Chun
    • Journal of Korea Multimedia Society / v.9 no.3 / pp.279-286 / 2006
  • The classification method used in this paper is applied for the first time to Korean traditional music. Among the frequency distribution vectors, the average peak value is proposed and shown to be effective compared with the previous classification success rate. Mean, variance, spectral centroid, average peak value, and ZCR are used to classify Korean traditional musical instruments. Spectral analysis is used to achieve automatic classification of Korean traditional instruments. In the spectral domain, various functions are introduced to extract features from the data files. The k-NN classification algorithm is applied in the experiments. Taegum, gayagum, and violin are classified with an accuracy of 94.44%, which is higher than the previous success rate of 87%.
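Two of the features named in this abstract, ZCR (zero-crossing rate) and spectral centroid, can be sketched as follows using their common textbook definitions. The paper's exact feature definitions, including its average peak value, are not given in the abstract, so the functions below are assumptions for illustration only; the naive DFT is quadratic in signal length and is meant only for short frames.

```python
import cmath
import math

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(signal, signal[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency in Hz (naive DFT)."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        magnitudes.append(abs(coeff))
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, return 0
    return sum(k * sample_rate / n * m
               for k, m in enumerate(magnitudes)) / total
```

A feature vector built from such values for each recording could then be passed to a k-NN classifier, with the distance computed between feature vectors rather than raw audio.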
