• Title/Abstract/Keywords: K-Means clustering algorithm

Development of newly recruited privates on-the-job Training Achievements Group Classification Model (신병 주특기교육 성취집단 예측모형 개발)

  • Kwak, Ki-Hyo;Suh, Yong-Moo
    • Journal of the military operations research society of Korea / v.33 no.2 / pp.101-113 / 2007
  • Under 'The law of National Defense Reformation' issued by the Ministry of National Defense, the period of military service will be shortened in phases through 2014. For this reason, the ROK Army provides differentiated education to newly recruited privates so that they perform more effectively in on-the-job training. For this training to be effective, it is essential to predict each new private's level of achievement. We therefore used data mining techniques to develop a classification model that assigns new privates to one of two achievement groups, so that a different education method can be applied to each group. The target variable is binary, taking the value 'general control group' or 'special control group'. We developed four pure classification models using a Neural Network, a Decision Tree, a Support Vector Machine, and a Naive Bayes classifier. We also built four hybrid models, each combining the k-means clustering algorithm with one of these four techniques. In our experiments, the best-performing model was the hybrid of k-means and the Neural Network. We expect these classification models to support a variety of military education programs and improve educational outcomes.
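The abstract does not say how k-means is combined with each classifier. The following is a minimal sketch of one common arrangement, assuming the training data are first clustered and a separate classifier is then fitted inside each cluster. The function names (`kmeans`, `fit_hybrid`, `predict_hybrid`) are hypothetical, and a nearest-class-mean rule stands in for the Neural Network, Decision Tree, SVM, or Naive Bayes model used in the paper.

```python
import math

def nearest(p, centers):
    """Index of the center closest to point p (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def fit_hybrid(X, y, k):
    """Cluster X, then fit a stand-in classifier (per-class means) in each cluster.

    Assumes every cluster ends up with at least one labelled sample.
    """
    centroids = kmeans(X, k)
    grouped = [dict() for _ in range(k)]
    for x, label in zip(X, y):
        grouped[nearest(x, centroids)].setdefault(label, []).append(x)
    models = [
        {label: [sum(col) / len(xs) for col in zip(*xs)] for label, xs in g.items()}
        for g in grouped
    ]
    return centroids, models

def predict_hybrid(x, centroids, models):
    """Route x to its nearest cluster, then apply that cluster's classifier."""
    model = models[nearest(x, centroids)]
    return min(model, key=lambda label: math.dist(x, model[label]))
```

Routing each sample through its own cluster lets the per-cluster model specialize on a more homogeneous subset, which is one plausible reason a hybrid could outperform a single global classifier.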

Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.735-740 / 2012
  • In this paper, we introduce a design methodology for a polynomial radial basis function neural network (RBFNN) classifier aided by Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Feature data are obtained by preprocessing the given data with PCA and LDA so as to minimize information loss, and are then used as the input to the RBFNN. The hidden layer of the RBFNN is built with the Fuzzy C-Means (FCM) clustering algorithm instead of receptive fields, and linear polynomial functions are used as the connection weights between the hidden and output layers. To optimize the classifier, structural and parametric values such as the number of PCA and LDA eigenvectors and the fuzzification coefficient of the FCM algorithm are tuned by the Artificial Bee Colony (ABC) optimization algorithm. The proposed classifier is applied to several machine learning datasets, and its results are compared with those of other classifiers.

The Redundancy Reduction Using Fuzzy C-means Clustering and Cosine Similarity on a Very Large Gas Sensor Array for Mimicking Biological Olfaction (생물학적 후각 시스템을 모방한 대규모 가스 센서 어레이에서 코사인 유사도와 퍼지 클러스터링을 이용한 중복도 제거 방법)

  • Kim, Jeong-Do;Kim, Jung-Ju;Park, Sung-Dae;Byun, Hyung-Gi;Persaud, K.C.;Lim, Seung-Ju
    • Journal of Sensor Science and Technology / v.21 no.1 / pp.59-67 / 2012
  • Recent sensor technology reportedly allows a conductive polymer sensor array of 65,536 elements to be built with broad but overlapping selectivity to different families of chemicals, emulating the characteristics of biological olfaction. However, such excessive redundancy brings large errors and risk, as well as inordinate computation time and local minima in signal processing methods such as neural networks. In this paper, we propose a new method that reduces the number of sensors used for analysis by reducing redundancy between sensors and removing unstable sensors with the cosine similarity method, and then selects representative sensors with the Fuzzy C-Means (FCM) algorithm. Only the representative sensors need to be used in the analysis. We also introduce the Discrete Wavelet Transform (DWT) as a preprocessing step for data compression in the time domain. Through experimental trials, we compared gas sensor data with and without redundancy reduction. The experiments confirm the feasibility and superiority of the proposed method.
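As a rough illustration of redundancy reduction by cosine similarity, the sketch below keeps a sensor only if its response vector is not nearly parallel to that of any sensor already kept. The greedy order, the function names, and the 0.99 threshold are assumptions made for illustration; the abstract does not give the paper's actual procedure or parameters.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two response vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def drop_redundant(responses, threshold=0.99):
    """Return indices of sensors to keep.

    A sensor is treated as redundant when its similarity to any already-kept
    sensor reaches the (hypothetical) threshold.
    """
    kept = []
    for i, response in enumerate(responses):
        if all(cosine_similarity(response, responses[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

Because cosine similarity ignores magnitude, two sensors whose responses differ only by a gain factor count as redundant under this rule.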

Mobile Application based on Image Processing and a Proportion for Food Intake Measuring

  • Kim, Do-Hyeon;Kim, Yoon;Han, Yu-Ri
    • Journal of the Korea Society of Computer and Information / v.22 no.5 / pp.57-63 / 2017
  • In this paper, we propose a reliable new technique for measuring food intake automatically from images, without user intervention. First, the user captures images of the food and bowl before and after the meal. The food and the bowl are segmented into separate regions using K-means clustering, the Otsu algorithm, morphological operations, and so on. The volume of food is then measured by a proportional expression based on container information such as its entrance diameter, depth, and bottom diameter. Finally, our method calculates the volume of food consumed as the difference between the volumes before and after the meal. The proposed technique is more accurate than existing methods for measuring food intake automatically. Experimental results show an average error rate of up to 7% for three types of containers. Computer simulation results indicate that the proposed algorithm is a convenient and accurate way to measure food intake.

Implementation of Elbow Method to improve the Gases Classification Performance based on the RBFN-NSG Algorithm

  • Jeon, Jin-Young;Choi, Jang-Sik;Byun, Hyung-Gi
    • Journal of Sensor Science and Technology / v.25 no.6 / pp.431-434 / 2016
  • Currently, the radial basis function network (RBFN) and various other neural networks are employed to classify gases using chemical sensor arrays, and their performance is steadily improving. In particular, the identification performance of the RBFN algorithm is being improved by optimizing parameters such as the center, width, and weight, and improved algorithms such as the radial basis function network-stochastic gradient (RBFN-SG) and the radial basis function network-normalized stochastic gradient (RBFN-NSG) have been published. In this study, we optimized the number of centers, one of the parameters of the RBFN-NSG algorithm, and observed the change in identification performance. Repeated measurement data for 8 samples were used, and the elbow method was applied to determine the optimal number of centers for each sample of input data. The experiment covered two cases (a single center per sample, and the optimal number of centers obtained by the elbow method), and the results were compared using the mean square error (MSE). The case using the optimal number of centers obtained by the elbow method showed better identification performance than the case without optimization.
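The elbow method mentioned above can be sketched as follows: run k-means for increasing k, record the within-cluster sum of squares (WCSS), and pick the k where the curve bends most sharply. This sketch makes several assumptions not taken from the paper: farthest-first seeding for determinism, the largest second difference of the WCSS curve as the elbow criterion (one common heuristic among several), and the hypothetical function names.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans_wcss(points, k, iters=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[idx].append(p)
        # Mean of each group; an empty group keeps its previous center.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return sum(math.dist(p, centers[i]) ** 2 for i, g in enumerate(groups) for p in g)

def elbow_k(points, k_max):
    """Return (k at the elbow, WCSS curve for k = 1..k_max). Requires k_max >= 3."""
    wcss = [kmeans_wcss(points, k) for k in range(1, k_max + 1)]
    # Second difference of the curve at index i corresponds to k = i + 1.
    best = max(range(1, k_max - 1), key=lambda i: wcss[i - 1] - 2 * wcss[i] + wcss[i + 1])
    return best + 1, wcss
```

In an RBFN setting, the selected k would be used as the number of centers for the corresponding sample's input data.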

Optimization of Fuzzy Learning Machine by Using Particle Swarm Optimization (PSO 알고리즘을 이용한 퍼지 Extreme Learning Machine 최적화)

  • Roh, Seok-Beom;Wang, Jihong;Kim, Yong-Soo;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.1 / pp.87-92 / 2016
  • In this paper, particle swarm optimization is used to optimize the parameters of the fuzzy Extreme Learning Machine. While conventional neural networks learn very slowly, the Extreme Learning Machine learns very quickly. The fuzzy Extreme Learning Machine combines the fast-learning Extreme Learning Machine with fuzzy logic, which can represent the linguistic information of field experts. The Extreme Learning Machine uses the general sigmoid function as its activation function. In the fuzzy Extreme Learning Machine, however, the activation function is the membership function defined in the fuzzy C-Means clustering procedure. We optimize the parameters of the membership functions using Particle Swarm Optimization. To validate the classification capability of the proposed classifier, we conduct several experiments on various machine learning datasets.
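For reference, the standard fuzzy C-Means membership of a point x in cluster i is u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)), where d_i is the distance from x to center i and m > 1 is the fuzzification coefficient. The sketch below computes it; using this value directly as a hidden-unit activation is an assumption about how the paper's network is wired, and the function name is hypothetical.

```python
import math

def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership degrees of point x with respect to each center.

    m is the fuzzification coefficient (m > 1). The returned values sum to 1.
    """
    distances = [math.dist(x, c) for c in centers]
    if 0.0 in distances:
        # x coincides with one or more centers: share full membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in distances) for di in distances]
```

Larger values of m flatten the memberships toward equal shares, while values near 1 approach hard k-means assignment, which is why the fuzzification coefficient is a natural target for the optimizer.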

Design of Fingerprints Identification Based on RBFNN Using Image Processing Techniques (영상처리 기법을 통한 RBFNN 패턴 분류기 기반 개선된 지문인식 시스템 설계)

  • Bae, Jong-Soo;Oh, Sung-Kwun;Kim, Hyun-Ki
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.6 / pp.1060-1069 / 2016
  • In this paper, we introduce a fingerprint recognition system based on a Radial Basis Function Neural Network (RBFNN). Fingerprints are classified into four types (Whorl, Arch, Right loop, Left loop). Preprocessing methods such as the fast Fourier transform, normalization, calculation of ridge direction, Gabor filtering, binarization, and a rotation algorithm are used to extract features from fingerprint images, and those features serve as the inputs of the network. The RBFNN uses Fuzzy C-Means (FCM) clustering in the hidden layer, and polynomial functions (linear, quadratic, and modified quadratic) are defined as the connection weights of the network. The Particle Swarm Optimization (PSO) algorithm optimizes a number of essential parameters needed to improve the accuracy of the RBFNN. The optimized parameters include the number of clusters and the fuzzification coefficient used in the FCM algorithm, and the polynomial order of the network. The performance of the proposed fingerprint recognition system is evaluated on fingerprint data sets collected with the Anguli program.

Fault Diagnosis for Rotating Machine Using Feature Extraction and Minimum Detection Error Algorithm (특징 추출과 검출 오차 최소화 알고리듬을 이용한 회전기계의 결함 진단)

  • Chong, Ui-pil;Cho, Sang-jin;Lee, Jae-yeal
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.16 no.1 s.106 / pp.27-33 / 2006
  • Fault diagnosis and condition monitoring for rotating machines are important for efficiency and accident prevention. Fault diagnosis consists of extracting features from signals and classifying each state. Conventionally, fault diagnosis has combined signal processing techniques for spectral analysis with pattern recognition. However, these methods cannot diagnose certain rotating machines and some fault phenomena correctly. In this paper, we add a minimum detection error algorithm to the previous method to reduce the detection error rate. Vibration signals of an induction motor are measured and divided into subband signals. Each subband signal is processed to obtain the RMS, the standard deviation, and other statistics, from which the feature vectors are constructed. We study a fault diagnosis system in which these feature vectors are fed to the K-means clustering algorithm and the minimum detection error algorithm.

Voice Activity Detection Algorithm base on Radial Basis Function Networks with Dual Threshold (Radial Basis Function Networks를 이용한 이중 임계값 방식의 음성구간 검출기)

  • Kim Hong Ik;Park Sung Kwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12C / pp.1660-1668 / 2004
  • This paper proposes a Voice Activity Detection (VAD) algorithm based on a Radial Basis Function (RBF) network using a dual threshold. The k-means clustering and Least Mean Square (LMS) algorithms are used to update the RBF network according to the underlying speech condition. The inputs to the RBF network are three parameters from a Code Excited Linear Prediction (CELP) coder, which works stably under various background noise levels. A dual hangover threshold is applied in the RBF-VAD to reduce errors, because the threshold value involves a trade-off in the VAD decision. The experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at every noise level tested.

Identification of Heterogeneous Prognostic Genes and Prediction of Cancer Outcome using PageRank (페이지랭크를 이용한 암환자의 이질적인 예후 유전자 식별 및 예후 예측)

  • Choi, Jonghwan;Ahn, Jaegyoon
    • Journal of KIISE / v.45 no.1 / pp.61-68 / 2018
  • The identification of genes that contribute to the prediction of prognosis in patients with cancer is one of the challenges in providing appropriate therapies. To find the prognostic genes, several classification models using gene expression data have been proposed. However, the prediction accuracy of cancer prognosis is limited due to the heterogeneity of cancer. In this paper, we integrate microarray data with biological network data using a modified PageRank algorithm to identify prognostic genes. We also predict the prognosis of patients with 6 cancer types (including breast carcinoma) using the K-Nearest Neighbor algorithm. Before we apply the modified PageRank, we separate samples by K-Means clustering to address the heterogeneity of cancer. The proposed algorithm showed better performance than traditional algorithms for prognosis. We were also able to identify cluster-specific biological processes using GO enrichment analysis.