• Title/Summary/Keyword: Improved K-means algorithm

Search Result 143, Processing Time 0.022 seconds

A Study of Similar Blog Recommendation System Using Termite Colony Algorithm (흰개미 군집 알고리즘을 이용한 유사 블로그 추천 시스템에 관한 연구)

  • Jeong, Gi Sung;Jo, I-Seok;Lee, Malrey
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.13 no.1
    • /
    • pp.83-88
    • /
    • 2013
  • This paper proposes a recommending system of the similar blogs gathered with similarities between blogs according to the similarity, dividing words, for each frequency, that individual blogs have. It improved the algorithm of k-means, using the model of the habits of white ants for better performance of clustering, and showed better performance of clustering as a result of evaluating and comparing with the existing algorithm of k-means as the improved algorithm. The recommending system of similar blog was designed and embodied, using the improved algorithm. TCA can reduce clustering time and the number of moving time for clustering compare with K-means algorithm.

An Implementation of K-Means Algorithm Improving Cluster Centroids Decision Methodologies (클러스터 중심 결정 방법을 개선한 K-Means 알고리즘의 구현)

  • Lee Shin-Won;Oh HyungJin;An Dong-Un;Jeong Seong-Jong
    • The KIPS Transactions:PartB
    • /
    • v.11B no.7 s.96
    • /
    • pp.867-874
    • /
    • 2004
  • K-Means algorithm is a non-hierarchical (plat) and reassignment techniques and iterates algorithm steps on the basis of K cluster centroids until the clustering results converge into K clusters. In its nature, K-Means algorithm has characteristics which make different results depending on the initial and new centroids. In this paper, we propose the modified K-Means algorithm which improves the initial and new centroids decision methodologies. By evaluating the performance of two algorithms using the 16 weighting scheme of SMART system, the modified algorithm showed $20{\%}$ better results on recall and F-measure than those of K-Means algorithm, and the document clustering results are quite improved.

An Improved K-means Document Clustering using Concept Vectors

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.4
    • /
    • pp.853-861
    • /
    • 2003
  • An improved K-means document clustering method has been presented, where a concept vector is manipulated for each cluster on the basis of cosine similarity of text documents. The concept vectors are unit vectors that have been normalized on the n-dimensional sphere. Because the standard K-means method is sensitive to initial starting condition, our improvement focused on starting condition for estimating the modes of a distribution. The improved K-means clustering algorithm has been applied to a set of text documents, called Classic3, to test and prove efficiency and correctness of clustering result, and showed 7% improvements in its worst case.

  • PDF

An Improved Automated Spectral Clustering Algorithm

  • Xiaodan Lv
    • Journal of Information Processing Systems
    • /
    • v.20 no.2
    • /
    • pp.185-199
    • /
    • 2024
  • In this paper, an improved automated spectral clustering (IASC) algorithm is proposed to address the limitations of the traditional spectral clustering (TSC) algorithm, particularly its inability to automatically determine the number of clusters. Firstly, a cluster number evaluation factor based on the optimal clustering principle is proposed. By iterating through different k values, the value corresponding to the largest evaluation factor was selected as the first-rank number of clusters. Secondly, the IASC algorithm adopts a density-sensitive distance to measure the similarity between the sample points. This rendered a high similarity to the data distributed in the same high-density area. Thirdly, to improve clustering accuracy, the IASC algorithm uses the cosine angle classification method instead of K-means to classify the eigenvectors. Six algorithms-K-means, fuzzy C-means, TSC, EIGENGAP, DBSCAN, and density peak-were compared with the proposed algorithm on six datasets. The results show that the IASC algorithm not only automatically determines the number of clusters but also obtains better clustering accuracy on both synthetic and UCI datasets.

User's Individuality Preference Recommendation System using Improved k-means Algorithm (개선된 k-means 알고리즘을 적용한 사용자 특성 선호도 추천 시스템)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.8
    • /
    • pp.141-148
    • /
    • 2010
  • In mobile terminal recommend service system has general information restrictive recommend that individuality considering to user's information find and recommend. Also it has difficult of accurate information recommend bad points user's not offer individuality information preference recommend service. Therefore this paper is propose user's information individuality preference considering by user's individuality preference recommendation system using improved k-means algorithm. Propose method is correlation coefficients using user's information individuality preference when user's individuality preference recommendation using improved k-means algorithm. Restrictive information recommend to fix a problem, information of restrictive general recommend that user's information individuality preference offer to accurate information recommend. Performance experiment is existing service system as compared to evaluating the effectiveness of precision and recall, performance experiment result is appear to precision 85%, recall 68%.

Initial Mode Decision Method for Clustering in Categorical Data

  • Yang, Soon-Cheol;Kang, Hyung-Chang;Kim, Chul-Soo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.2
    • /
    • pp.481-488
    • /
    • 2007
  • The k-means algorithm is well known for its efficiency in clustering large data sets. However, working only on numeric values prohibits it from being used to cluster real world data containing categorical values. The k-modes algorithm is to extend the k-means paradigm to categorical domains. The algorithm requires a pre-setting or random selection of initial points (modes) of the clusters. This paper improved the problem of k-modes algorithm, using the Max-Min method that is a kind of methods to decide initial values in k-means algorithm. we introduce new similarity measures to deal with using the categorical data for clustering. We show that the mushroom data sets and soybean data sets tested with the proposed algorithm has shown a good performance for the two aspects(accuracy, run time).

  • PDF

An Improved Hybrid Canopy-Fuzzy C-Means Clustering Algorithm Based on MapReduce Model

  • Dai, Wei;Yu, Changjun;Jiang, Zilong
    • Journal of Computing Science and Engineering
    • /
    • v.10 no.1
    • /
    • pp.1-8
    • /
    • 2016
  • The fuzzy c-means (FCM) is a frequently utilized algorithm at present. Yet, the clustering quality and convergence rate of FCM are determined by the initial cluster centers, and so an improved FCM algorithm based on canopy cluster concept to quickly analyze the dataset has been proposed. Taking advantage of the canopy algorithm for its rapid acquisition of cluster centers, this algorithm regards the cluster results of canopy as the input. In this way, the convergence rate of the FCM algorithm is accelerated. Meanwhile, the MapReduce scheme of the proposed FCM algorithm is designed in a cloud environment. Experimental results demonstrate the hybrid canopy-FCM clustering algorithm processed by MapReduce be endowed with better clustering quality and higher operation speed.

Improved Nonlocal Means Algorithm for Image Denoising (영상 잡음 제거를 위해 개선된 비지역적 평균 알고리즘)

  • Park, Sang-Wook;Kang, Moon-Gi
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.1
    • /
    • pp.46-53
    • /
    • 2011
  • Nonlocal means denoising algorithm is one of the most widely used denoising algorithm. Because it performs well, and the theoretic idea is intuitive and simple. However the conventional nonlocal means algorithm has still some problems such as noise remaining in the denoised flat region and blurring artifacts in the denoised edge and pattern region. Thus many improved algorithms based on nonlocal means have been proposed. In this paper, we proposed new improved nonlocal means denoising algorithm by weight update through weights sorting and newly defined threshold. Updated weights can make weights more refined and definite, and denoising is possible without that artifacts. Experimental results including comparisons with conventional algorithms for various noise levels and test images show the proposed algorithm has a good performance in both visual and quantitative criteria.

Selection of Cluster Hierarchy Depth and Initial Centroids in Hierarchical Clustering using K-Means Algorithm (K-Means 알고리즘을 이용한 계층적 클러스터링에서 클러스터 계층 깊이와 초기값 선정)

  • Lee, Shin-Won;An, Dong-Un;Chong, Sung-Jong
    • Journal of the Korean Society for information Management
    • /
    • v.21 no.4 s.54
    • /
    • pp.173-185
    • /
    • 2004
  • Fast and high-quality document clustering algorithms play an important role in providing data exploration by organizing large amounts of information into a small number of meaningful clusters. Many papers have shown that the hierarchical clustering method takes good-performance, but is limited because of its quadratic time complexity. In contrast, with a large number of variables, K-means has a time complexity that is linear in the number of documents, but is thought to produce inferior clusters. In this paper, Condor system using K-Means algorithm Compares with regular method that the initial centroids have been established in advance, our method performance has been improved a lot.

K-Means-Based Polynomial-Radial Basis Function Neural Network Using Space Search Algorithm: Design and Comparative Studies (공간 탐색 최적화 알고리즘을 이용한 K-Means 클러스터링 기반 다항식 방사형 기저 함수 신경회로망: 설계 및 비교 해석)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.17 no.8
    • /
    • pp.731-738
    • /
    • 2011
  • In this paper, we introduce an advanced architecture of K-Means clustering-based polynomial Radial Basis Function Neural Networks (p-RBFNNs) designed with the aid of SSOA (Space Search Optimization Algorithm) and develop a comprehensive design methodology supporting their construction. In order to design the optimized p-RBFNNs, a center value of each receptive field is determined by running the K-Means clustering algorithm and then the center value and the width of the corresponding receptive field are optimized through SSOA. The connections (weights) of the proposed p-RBFNNs are of functional character and are realized by considering three types of polynomials. In addition, a WLSE (Weighted Least Square Estimation) is used to estimate the coefficients of polynomials (serving as functional connections of the network) of each node from output node. Therefore, a local learning capability and an interpretability of the proposed model are improved. The proposed model is illustrated with the use of nonlinear function, NOx called Machine Learning dataset. A comparative analysis reveals that the proposed model exhibits higher accuracy and superb predictive capability in comparison to some previous models available in the literature.