• Title/Summary/Keyword: K-means++ algorithm

Korean Onomatopoeia Clustering for Sound Database (음향 DB 구축을 위한 한국어 의성어 군집화)

  • Kim, Myung-Gwan;Shin, Young-Suk;Kim, Young-Rye
    • Journal of Korea Multimedia Society / v.11 no.9 / pp.1195-1203 / 2008
  • Onomatopoeia in Korean documents renders natural or artificial sounds in human language. Because it expresses the sound closest to an object, it can also serve as a standard for clustering multimedia data. In this study, we measure the frequency of onomatopoeia among the experimental subjects and select 100 onomatopoeic words for our study. To cluster the relations among the onomatopoeia, we extract similarity features and a distance metric, and then represent the relations in a vector space using PCA. Finally, we cluster the onomatopoeia using the k-means algorithm.

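The final step of the pipeline described above, running k-means on vectors that have already been projected into a low-dimensional space, might look roughly like the sketch below. This is not the authors' implementation: the 2-D coordinates are made-up stand-ins for PCA-projected features, and seeding the centroids with the first k points is an assumption made only to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means on tuples of floats.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic (real code would usually seed randomly).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return labels, centroids

# Hypothetical 2-D coordinates standing in for PCA-projected feature vectors.
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.9, 5.0)]
labels, centroids = kmeans(points, k=2)
```

With these toy points, the three points near the origin end up in one cluster and the three points near (5, 5) in the other.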

Content-based Image Retrieving Using Color Histogram and Color Texture (컬러 히스토그램과 컬러 텍스처를 이용한 내용기반 영상 검색 기법)

  • Lee, Hyung-Goo;Yun, Il-Dong
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.9 / pp.76-90 / 1999
  • In this paper, a color image retrieval algorithm is proposed based on color histogram and color texture. The representative color vectors of a color image are obtained by k-means clustering of its color histogram, and color texture is generated by centering around the color of pixels with its color vector. Color texture thus denotes texture properties emphasized by the color histogram, and it is analyzed with a Gaussian Markov Random Field (GMRF) model. The proposed algorithm works efficiently because it does not require any low-level image processing such as segmentation or edge detection, and it outperforms traditional algorithms that use only the color histogram or texture properties derived from image intensity.


Preprocessing Effect by Using k-means Clustering and Merging Algorithms in MR Cardiac Left Ventricle Segmentation (자기공명 심장 영상의 좌심실 경계추출에서의 k 평균 군집화와 병합 알고리즘의 사용으로 인한 전처리 효과)

  • Ik-Hwan Cho;Jung-Su Oh;Kyong-Sik Om;In-Chan Song;Kee-Hyun Chang;Dong-Seok Jeong
    • Journal of Biomedical Engineering Research / v.24 no.2 / pp.55-60 / 2003
  • For quantitative analysis of cardiac diseases, it is necessary to segment the left ventricle (LV) in MR (magnetic resonance) cardiac images. The snake, or active contour model, has been used to segment the LV boundary. However, the LV contour obtained from these models may not converge to the desired one, because the contour may fall into a local minimum caused by image artifacts inside the LV. Therefore, in this paper, we propose a preprocessing method using k-means clustering and merging algorithms that can improve the performance of the active contour model. Experimental results verify that the proposed algorithm overcomes the local minimum convergence problem.

The Design of Optimal Fuzzy-Neural networks Structure by Means of GA and an Aggregate Weighted Performance Index (유전자 알고리즘과 합성 성능지수에 의한 최적 퍼지-뉴럴 네트워크 구조의 설계)

  • Oh, Sung-Kwun;Yoon, Ki-Chan;Kim, Hyun-Ki
    • Journal of Institute of Control, Robotics and Systems / v.6 no.3 / pp.273-283 / 2000
  • In this paper, we suggest an optimal design method for the Fuzzy-Neural Networks (FNN) model for complex and nonlinear systems. The FNNs use simplified inference as the fuzzy inference method and the Error Back Propagation Algorithm as the learning rule. We use the HCM (Hard C-Means) clustering algorithm to find the initial parameters of the membership functions. For parameters such as those of the membership functions, the learning rates, and the momentum, a weighted value is proposed to achieve a sound balance between the approximation and generalization abilities of the model. By selecting and adjusting the weighting factor of an aggregate objective function, which depends on the number of data and a certain degree of nonlinearity (distribution of I/O data), we show that it is feasible and effective to design an optimal FNN model structure with a mutual balance and dependency between approximation and generalization abilities. This methodology sheds light on the role and impact of different parameters of the model on its performance (especially the mapping and predicting capabilities of rule-based computing). To evaluate the performance of the proposed model, we use time series data for a gas furnace, data from a sewage treatment process, and data from a traffic route choice process.


Face Recognition Based on PCA and LDA Combining Clustering (Clustering을 결합한 PCA와 LDA 기반 얼굴 인식)

  • Guo, Lian-Hua;Kim, Pyo-Jae;Chang, Hyung-Jin;Choi, Jin-Young
    • Proceedings of the IEEK Conference / 2006.06a / pp.387-388 / 2006
  • In this paper, we propose an efficient algorithm that combines PCA and LDA with the K-means clustering method and achieves better face recognition accuracy than Eigenface and Fisherface. In this algorithm, PCA is first used to reduce the dimensionality of the original face images. Second, the truncated face image data are sub-clustered by the K-means clustering method based on Euclidean distances, and all small subclusters are labeled in sequence. The LDA method then projects the data into a low-dimensional feature space and groups them so that they are easier to classify. Finally, we use the nearest-neighbor method to determine the labels of the test data. To show the recognition accuracy of the proposed algorithm, we performed several simulations using the Yale and ORL (Olivetti Research Laboratory) databases. Simulation results show that the proposed method achieves better recognition accuracy.


Recognition of Car License Plates using Morphological Information and SOM Algorithm

  • Lim, Eun-Kyung;Kim, Young-Ju;Kim, Dae-Su;Kwang-Baek, Kim
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.648-651 / 2003
  • In this paper, we propose a license plate recognition system based on the SOM algorithm. Morphological information from horizontal and vertical edges is used to extract the plate region from a car image. In addition, a 4-direction contour tracking algorithm is applied to extract the specific area containing characters from the extracted plate region, and the character string is recognized using the SOM algorithm. Fifty car images were tested. The proposed extraction method achieved a better extraction rate than methods based on RGB and HSI color information, respectively, and license plate recognition using the SOM algorithm was very efficient.


Development of a Clustering Model for Automatic Knowledge Classification (지식 분류의 자동화를 위한 클러스터링 모형 연구)

  • 정영미;이재윤
    • Journal of the Korean Society for Information Management / v.18 no.2 / pp.203-230 / 2001
  • The purpose of this study is to develop a document clustering model for automatic classification of knowledge. Two test collections, one of newspaper article texts and one of journal article abstracts, are built for the clustering experiment. Various feature reduction criteria as well as term weighting methods are applied to the term sets of the test collections, and cosine and Jaccard coefficients are used as similarity measures. The performance of the complete linkage and K-means clustering algorithms is compared using different feature selection methods and various term weights. It was found that complete linkage clustering outperforms the K-means algorithm, and that feature reduction up to almost 10% of the total feature sets does not lower the performance of document clustering to any significant extent.

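The two similarity measures named in the abstract above can be illustrated with a small sketch. It assumes documents are represented as hypothetical term-to-weight dictionaries, and it uses the binary (set-based) form of the Jaccard coefficient; the study may have used a weighted variant, which the abstract does not specify.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for an empty document
    return dot / (norm_a * norm_b)

def jaccard_coefficient(a, b):
    """Binary Jaccard: shared terms divided by all distinct terms (weights ignored)."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0

# Hypothetical term weights for two short documents.
doc1 = {"cluster": 2.0, "document": 1.0}
doc2 = {"cluster": 1.0, "term": 1.0}

cos = cosine_similarity(doc1, doc2)
jac = jaccard_coefficient(doc1, doc2)
```

Cosine similarity takes the term weights into account, while the binary Jaccard coefficient only looks at which terms co-occur, so the two can rank document pairs differently under different term weighting schemes.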

A Study on Static Situation Awareness System with the Aid of Optimized Polynomial Radial Basis Function Neural Networks (최적화된 pRBF 뉴럴 네트워크에 의한 정적 상황 인지 시스템에 관한 연구)

  • Oh, Sung-Kwun;Na, Hyun-Suk;Kim, Wook-Dong
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.12 / pp.2352-2360 / 2011
  • In this paper, we introduce a comprehensive design methodology for Radial Basis Function Neural Networks (RBFNN) based on clustering and optimization algorithms. A clustering algorithm partitions the input dataset into clusters based on similarity, so that the number of clusters equals the number of nodes in the hidden layer. Moreover, the center of each cluster is used as the center of the corresponding receptive field in the hidden layer. In this study, we applied the Fuzzy C-Means (FCM) and K-Means (KM) clustering algorithms, respectively, and compared them. The connection weights of the model are expanded into polynomial functions, such as linear and quadratic ones. For this reason, the output of the model reflects the relation between input and output. To obtain the optimal structure and better performance, Particle Swarm Optimization (PSO) is used. Structural optimization yields optimized parameters such as the number of clusters and the polynomial order of the connection weights, and parametric optimization yields the widths of the receptive fields. To evaluate the performance of the proposed model, NXT equipment offered by National Instruments (NI) is used. The intelligent model for the situation awareness system was built from an experimental dataset of distance information measured between an object and diverse sensors of the NXT equipment, such as the sound sensor, light sensor, and ultrasonic sensor.
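
To make the structure described above more concrete, the following sketch evaluates an RBF network whose hidden-node centers are taken to be cluster centers and whose connection weights are linear polynomials of the input. It is an illustration under several assumptions, not the authors' model: the Gaussian receptive field, the single shared width, the unnormalized output sum, and all numeric values are hypothetical.

```python
import math

def rbf_activations(x, centers, width):
    """Gaussian receptive fields, one per hidden node (center = cluster center)."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]

def prbf_output(x, centers, width, coeffs):
    """Sum of activations times linear-polynomial weights.

    coeffs[i] = (bias, [a_1, ..., a_n]) for hidden node i, so the weight of
    node i is bias + a_1 * x_1 + ... + a_n * x_n (assumed linear type).
    """
    activations = rbf_activations(x, centers, width)
    return sum(
        act * (bias + sum(a * xi for a, xi in zip(lin, x)))
        for act, (bias, lin) in zip(activations, coeffs)
    )

# Hypothetical cluster centers (e.g. from k-means) and polynomial coefficients.
centers = [(0.0, 0.0), (1.0, 1.0)]
coeffs = [(1.0, [0.5, 0.5]), (2.0, [0.0, 0.0])]

y = prbf_output((0.0, 0.0), centers, width=1.0, coeffs=coeffs)
```

In this reading, the number of entries in `centers` plays the role of the number of clusters, and the form of each entry in `coeffs` plays the role of the polynomial order, which are the quantities the abstract says are tuned by structural optimization.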

An Efficient Method to Compute a Covariance Matrix of the Non-local Means Algorithm for Image Denoising with the Principal Component Analysis (영상 잡음 제거를 위한 주성분 분석 기반 비 지역적 평균 알고리즘의 효율적인 공분산 행렬 계산 방법)

  • Kim, Jeonghwan;Jeong, Jechang
    • Journal of Broadcast Engineering / v.21 no.1 / pp.60-65 / 2016
  • This paper introduces the non-local means (NLM) algorithm for image denoising, along with an improved algorithm based on principal component analysis (PCA). To perform the PCA, a covariance matrix of the given image must be evaluated first. If the size of the neighborhood patches of the NLM is S × S and the number of pixels is Q, a matrix multiplication of size S² × Q is required to compute the covariance matrix. Given the characteristics of images, such a computation is inefficient. Therefore, this paper proposes an efficient method to compute the covariance matrix by sampling the pixels. After sampling, the covariance matrix can be computed with matrices of size S² × floor(Width/l) × (Height/l).

Performance Analysis of User Clustering Algorithms against User Density and Maximum Number of Relays for D2D Advertisement Dissemination (최대 전송횟수 제한 및 사용자 밀집도 변화에 따른 사용자 클러스터링 알고리즘 별 D2D 광고 확산 성능 분석)

  • Han, Seho;Kim, Junseon;Lee, Howon
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.4 / pp.721-727 / 2016
  • In this paper, to address the reduced efficiency of conventional algorithms for D2D (device-to-device) advertisement dissemination, we propose advertisement dissemination algorithms based on several clustering algorithms: the modified single linkage (MSL) algorithm, the K-means algorithm, and the expectation maximization (EM) algorithm with a Gaussian mixture model. These algorithms aim to improve advertisement dissemination efficiency in D2D communication networks. Target areas are clustered into several target groups by the proposed clustering algorithms. D2D advertisements are then distributed consecutively using a routing algorithm based on the geographical distribution of the target areas and a relay selection algorithm based on the distance between the D2D sender and the D2D receiver. Through intensive MATLAB simulations, we analyze the performance of the proposed algorithms with respect to the maximum number of relay transmissions and the D2D user density ratio between a target area and a non-target area.