• Title/Summary/Keyword: Classifier Clustering

Search Result 136, Processing Time 0.022 seconds

Context-Aware Fusion with Support Vector Machine (Support Vector Machine을 이용한 문맥 인지형 융합)

  • Heo, Gyeong-Yong;Kim, Seong-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.6
    • /
    • pp.19-26
    • /
    • 2014
  • An ensemble classifier system is a widely-used multi-classifier system, which combines the results from each classifier and, as a result, achieves better classification result than any single classifier used. Several methods have been used to build an ensemble classifier including boosting, which is a cascade method where misclassified examples in previous stage are used to boost the performance in current stage. Boosting is, however, a serial method which does not form a complete feedback loop. In this paper, proposed is context sensitive SVM ensemble (CASE) which adopts SVM, one of the best classifiers in term of classification rate, as a basic classifier and clustering method to divide feature space into contexts. As CASE divides feature space and trains SVMs simultaneously, the result from one component can be applied to the other and CASE achieves better result than boosting. Experimental results prove the usefulness of the proposed method.

Fine-Grained Mobile Application Clustering Model Using Retrofitted Document Embedding

  • Yoon, Yeo-Chan;Lee, Junwoo;Park, So-Young;Lee, Changki
    • ETRI Journal
    • /
    • v.39 no.4
    • /
    • pp.443-454
    • /
    • 2017
  • In this paper, we propose a fine-grained mobile application clustering model using retrofitted document embedding. To automatically determine the clusters and their numbers with no predefined categories, the proposed model initializes the clusters based on title keywords and then merges similar clusters. For improved clustering performance, the proposed model distinguishes between an accurate clustering step with titles and an expansive clustering step with descriptions. During the accurate clustering step, an automatically tagged set is constructed as a result. This set is utilized to learn a high-performance document vector. During the expansive clustering step, more applications are then classified using this document vector. Experimental results showed that the purity of the proposed model increased by 0.19, and the entropy decreased by 1.18, compared with the K-means algorithm. In addition, the mean average precision improved by more than 0.09 in a comparison with a support vector machine classifier.

Design of Partial Discharge Pattern Classifier of Softmax Neural Networks Based on K-means Clustering : Comparative Studies and Analysis of Classifier Architecture (K-means 클러스터링 기반 소프트맥스 신경회로망 부분방전 패턴분류의 설계 : 분류기 구조의 비교연구 및 해석)

  • Jeong, Byeong-Jin;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.1
    • /
    • pp.114-123
    • /
    • 2018
  • This paper concerns a design and learning method of softmax function neural networks based on K-means clustering. The partial discharge data Information is preliminarily processed through simulation using an Epoxy Mica Coupling sensor and an internal Phase Resolved Partial Discharge Analysis algorithm. The obtained information is processed according to the characteristics of the pattern using a Motor Insulation Monitoring System program. At this time, the processed data are total 4 types that void discharge, corona discharge, surface discharge and slot discharge. The partial discharge data with high dimensional input variables are secondarily processed by principal component analysis method and reduced with keeping the characteristics of pattern as low dimensional input variables. And therefore, the pattern classifier processing speed exhibits improved effects. In addition, in the process of extracting the partial discharge data through the MIMS program, the magnitude of amplitude is divided into the maximum value and the average value, and two pattern characteristics are set and compared and analyzed. In the first half of the proposed partial discharge pattern classifier, the input and hidden layers are classified by using the K-means clustering method and the output of the hidden layer is obtained. In the latter part, the cross entropy error function is used for parameter learning between the hidden layer and the output layer. The final output layer is output as a normalized probability value between 0 and 1 using the softmax function. The advantage of using the softmax function is that it allows access and application of multiple class problems and stochastic interpretation. First of all, there is an advantage that one output value affects the remaining output value and its accompanying learning is accelerated. Also, to solve the overfitting problem, L2-normalization is applied. To prove the superiority of the proposed pattern classifier, we compare and analyze the classification rate with conventional radial basis function neural networks.

Identification of Plastic Wastes by Using Fuzzy Radial Basis Function Neural Networks Classifier with Conditional Fuzzy C-Means Clustering

  • Roh, Seok-Beom;Oh, Sung-Kwun
    • Journal of Electrical Engineering and Technology
    • /
    • v.11 no.6
    • /
    • pp.1872-1879
    • /
    • 2016
  • The techniques to recycle and reuse plastics attract public attention. These public attraction and needs result in improving the recycling technique. However, the identification technique for black plastic wastes still have big problem that the spectrum extracted from near infrared radiation spectroscopy is not clear and is contaminated by noise. To overcome this problem, we apply Raman spectroscopy to extract a clear spectrum of plastic material. In addition, to improve the classification ability of fuzzy Radial Basis Function Neural Networks, we apply supervised learning based clustering method instead of unsupervised clustering method. The conditional fuzzy C-Means clustering method, which is a kind of supervised learning based clustering algorithms, is used to determine the location of radial basis functions. The conditional fuzzy C-Means clustering analyzes the data distribution over input space under the supervision of auxiliary information. The auxiliary information is defined by using k Nearest Neighbor approach.

Design of Black Plastics Classifier Using Data Information (데이터 정보를 이용한 흑색 플라스틱 분류기 설계)

  • Park, Sang-Beom;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.4
    • /
    • pp.569-577
    • /
    • 2018
  • In this paper, with the aid of information which is included within data, preprocessing algorithm-based black plastic classifier is designed. The slope and area of spectrum obtained by using laser induced breakdown spectroscopy(LIBS) are analyzed for each material and its ensuing information is applied as the input data of the proposed classifier. The slope is represented by the rate of change of wavelength and intensity. Also, the area is calculated by the wavelength of the spectrum peak where the material property of chemical elements such as carbon and hydrogen appears. Using informations such as slope and area, input data of the proposed classifier is constructed. In the preprocessing part of the classifier, Principal Component Analysis(PCA) and fuzzy transform are used for dimensional reduction from high dimensional input variables to low dimensional input variables. Characteristic analysis of the materials as well as the processing speed of the classifier is improved. In the condition part, FCM clustering is applied and linear function is used as connection weight in the conclusion part. By means of Particle Swarm Optimization(PSO), parameters such as the number of clusters, fuzzification coefficient and the number of input variables are optimized. To demonstrate the superiority of classification performance, classification rate is compared by using WEKA 3.8 data mining software which contains various classifiers such as Naivebayes, SVM and Multilayer perceptron.

Review on Genetic Algorithms for Pattern Recognition (패턴 인식을 위한 유전 알고리즘의 개관)

  • Oh, Il-Seok
    • The Journal of the Korea Contents Association
    • /
    • v.7 no.1
    • /
    • pp.58-64
    • /
    • 2007
  • In pattern recognition field, there are many optimization problems having exponential search spaces. To solve of sequential search algorithms seeking sub-optimal solutions have been used. The algorithms have limitations of stopping at local optimums. Recently lots of researches attempt to solve the problems using genetic algorithms. This paper explains the huge search spaces of typical problems such as feature selection, classifier ensemble selection, neural network pruning, and clustering, and it reviews the genetic algorithms for solving them. Additionally we present several subjects worthy of noting as future researches.

A Low Complexity PTS Technique using Threshold for PAPR Reduction in OFDM Systems

  • Lim, Dai Hwan;Rhee, Byung Ho
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.6 no.9
    • /
    • pp.2191-2201
    • /
    • 2012
  • Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics without the need to examine packet payloads. Classification proceeds in two steps. Classification rules are first built by analyzing traffic traces, and then the classification rules are evaluated using test data. In this paper, we use self-organizing map and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate for the traffic data and significantly better compared to a minimum mean distance classifier.

Design of One-Class Classifier Using Hyper-Rectangles (Hyper-Rectangles를 이용한 단일 분류기 설계)

  • Jeong, In Kyo;Choi, Jin Young
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.41 no.5
    • /
    • pp.439-446
    • /
    • 2015
  • Recently, the importance of one-class classification problem is more increasing. However, most of existing algorithms have the limitation on providing the information that effects on the prediction of the target value. Motivated by this remark, in this paper, we suggest an efficient one-class classifier using hyper-rectangles (H-RTGLs) that can be produced from intervals including observations. Specifically, we generate intervals for each feature and integrate them. For generating intervals, we consider two approaches : (i) interval merging and (ii) clustering. We evaluate the performance of the suggested methods by computing classification accuracy using area under the roc curve and compare them with other one-class classification algorithms using four datasets from UCI repository. Since H-RTGLs constructed for a given data set enable classification factors to be visible, we can discern which features effect on the classification result and extract patterns that a data set originally has.

Text-independent Speaker Identification Using Soft Bag-of-Words Feature Representation

  • Jiang, Shuangshuang;Frigui, Hichem;Calhoun, Aaron W.
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.4
    • /
    • pp.240-248
    • /
    • 2014
  • We present a robust speaker identification algorithm that uses novel features based on soft bag-of-word representation and a simple Naive Bayes classifier. The bag-of-words (BoW) based histogram feature descriptor is typically constructed by summarizing and identifying representative prototypes from low-level spectral features extracted from training data. In this paper, we define a generalization of the standard BoW. In particular, we define three types of BoW that are based on crisp voting, fuzzy memberships, and possibilistic memberships. We analyze our mapping with three common classifiers: Naive Bayes classifier (NB); K-nearest neighbor classifier (KNN); and support vector machines (SVM). The proposed algorithms are evaluated using large datasets that simulate medical crises. We show that the proposed soft bag-of-words feature representation approach achieves a significant improvement when compared to the state-of-art methods.

Sales Forecasting Model for Apparel Products Using Machine Learning Technique - A Case Study on Forecasting Outerwear Items - (머신 러닝을 활용한 의류제품의 판매량 예측 모델 - 아우터웨어 품목을 중심으로 -)

  • Chae, Jin Mie;Kim, Eun Hie
    • Fashion & Textile Research Journal
    • /
    • v.23 no.4
    • /
    • pp.480-490
    • /
    • 2021
  • Sales forecasting is crucial for many retail operations. For apparel retailers, accurate sales forecast for the next season is critical to properly manage inventory and plan their supply chains. The challenge in this increases because apparel products are always new for the next season, have numerous variations, short life cycles, long lead times, and seasonal trends. In this study, a sales forecasting model is proposed for apparel products using machine learning techniques. The sales data pertaining to outerwear items for four years were collected from a Korean sports brand and filtered with outliers. Subsequently, the data were standardized by removing the effects of exogenous variables. The sales patterns of outerwear items were clustered by applying K-means clustering, and outerwear attributes associated with the specific sales-pattern type were determined by using a decision tree classifier. Six types of sales pattern clusters were derived and classified using a hybrid model of clustering and decision tree algorithm, and finally, the relationship between outerwear attributes and sales patterns was revealed. Each sales pattern can be used to predict stock-keeping-unit-level sales based on item attributes.