• Title/Summary/Keyword: K-Means clustering algorithm

Search Result 548

Privacy-Preserving k-means Clustering of Encrypted Data (암호화된 데이터에 대한 프라이버시를 보존하는 k-means 클러스터링 기법)

  • Jeong, Yunsong;Kim, Joon Sik;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.6 / pp.1401-1414 / 2018
  • The k-means clustering algorithm partitions input data into groups, with the number of groups given by the variable k. It is widely applicable and is particularly useful in market segmentation and medical research. In this paper, we propose a privacy-preserving clustering algorithm suited to outsourced encrypted data that exposes no information about the input data itself. Notably, our model keeps all data encrypted, a significant advantage over existing privacy-preserving clustering algorithms, which rely on multi-party computation over plaintext data stored on several servers. Our approach compares homomorphically encrypted ciphertexts to measure the distance between input data. Finally, we theoretically prove that our scheme guarantees the security of input data during computation, and we evaluate its communication and computation complexity in detail.
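
For orientation, the plaintext k-means loop that a scheme like this would have to evaluate over ciphertexts can be sketched as below. This is standard Lloyd iteration on unencrypted data, not the authors' protocol; the function names `kmeans` and `sq_dist`, the squared Euclidean distance, and the caller-supplied initial centroids are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], centroids: List[Point],
           max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd iteration (illustrative only, no encryption involved)."""
    centroids = [tuple(c) for c in centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(old))
                ))
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

The distance comparison inside the assignment step is the operation the abstract says is carried out on homomorphically encrypted ciphertexts.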

Repeated K-means Clustering Algorithm For Radar Sorting (레이더 군집화를 위한 반복 K-means 클러스터링 알고리즘)

  • Dong Hyun Park;Dong-ho Seo;Jee-hyeon Baek;Won-jin Lee;Dong Eui Chang
    • Journal of the Korea Institute of Military Science and Technology / v.26 no.5 / pp.384-391 / 2023
  • In modern electronic warfare, many radar emitters operate at once, so radar receivers receive high-density signal pulses that occur simultaneously. To analyze radar signals more accurately and identify enemies, sorting the high-density radar signals before analysis is very important. Recently, machine learning algorithms, specifically K-means clustering, have been studied as a way to improve the accuracy of radar signal sorting. One challenge faced by these studies is that the clustering results can vary depending on how the initial points are selected and how many clusters are set. This paper introduces a repeated K-means clustering algorithm that aims to accurately cluster all data by identifying and addressing false clusters in the radar sorting problem. To verify the performance of the proposed algorithm, experiments are conducted on simulated signals produced by a signal generator.
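
The sensitivity to initial points described above can be seen on a toy one-dimensional data set. The sketch below is not the paper's repeated K-means algorithm; it only shows plain K-means converging to different results from different starting centers. The function name `kmeans_1d` and the data values are made up for the example.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], init: Sequence[float],
              max_iter: int = 100) -> Tuple[List[float], float]:
    """Run plain K-means on scalar data from the given initial centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [float(c) for c in init]
    for _ in range(max_iter):
        # Assign every value to its nearest center.
        groups: List[List[float]] = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (empty groups stay put).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    sse = sum(min((v - c) ** 2 for c in centers) for v in values)
    return centers, sse


if __name__ == "__main__":
    pulses = [0, 1, 10, 11, 20, 21]         # three obvious groups
    print(kmeans_1d(pulses, [0, 10, 20]))   # one center per group
    print(kmeans_1d(pulses, [0, 1, 10]))    # two centers start in one group
```

With the second initialization, two centers stay stuck on the first group and the remaining four values are merged into one cluster, which is the kind of initialization-dependent result the abstract refers to. A common generic mitigation, separate from the method proposed in the paper, is to rerun from several initializations and keep the result with the lowest SSE.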

Nonlinear Characteristics of Fuzzy Scatter Partition-Based Fuzzy Inference System

  • Park, Keon-Jun;Huang, Wei;Yu, C.;Kim, Yong K.
    • International journal of advanced smart convergence / v.2 no.1 / pp.12-17 / 2013
  • This paper introduces a fuzzy inference system based on fuzzy scatter partitioning to model nonlinear processes and analyze their nonlinear characteristics. The fuzzy rules are generated by partitioning the input space in scatter form using the Fuzzy C-Means (FCM) clustering algorithm. The premise parameters of the rules are determined from the membership matrix produced by FCM clustering. The consequence part of each rule is represented as a polynomial function whose parameters are estimated by the least squares method. The proposed model is evaluated on data widely used for nonlinear processes. Finally, the paper shows that the proposed model performs well on high-dimensional nonlinear processes.
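
As a rough illustration of the membership matrix mentioned above, the standard FCM membership update for fixed cluster centers can be written as follows. This is the textbook update rule, not code from the paper; the function name `fcm_memberships`, the fuzzifier default `m = 2.0`, and the Euclidean distance are assumptions for the sketch.

```python
from typing import List, Sequence


def fcm_memberships(points: Sequence[Sequence[float]],
                    centers: Sequence[Sequence[float]],
                    m: float = 2.0) -> List[List[float]]:
    """Standard FCM membership update for fixed centers (fuzzifier m > 1).

    Row i holds the degrees to which point i belongs to each center;
    every row sums to 1.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Euclidean distance from this point to every center.
        d = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5 for c in centers]
        if any(x == 0.0 for x in d):
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            memberships.append([h / sum(hits) for h in hits])
        else:
            memberships.append([
                1.0 / sum((d[j] / d[k]) ** power for k in range(len(centers)))
                for j in range(len(centers))
            ])
    return memberships


if __name__ == "__main__":
    for row in fcm_memberships([(0.0,), (1.0,), (3.0,)], [(0.0,), (4.0,)]):
        print(row)
```

In a full FCM run this update alternates with a recomputation of the centers as membership-weighted means until the matrix stops changing.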

A Hybrid Genetic Algorithm for K-Means Clustering

  • Jun, Sung-Hae;Han, Jin-Woo;Park, Minjae;Oh, Kyung-Whan
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.330-333 / 2003
  • The initial cluster size is very important to the result of partitioning-based clustering methods. In the K-means algorithm, the result of cluster analysis varies with the cluster size K. Usually, the initial cluster size is determined by prior, subjective information, which may not be optimal. A more objective method is needed to solve this problem. In our research, we propose a hybrid genetic algorithm, a tree-induction-based evolutionary algorithm, for determining the optimal cluster size. The initial population of this algorithm is determined by the number of terminal nodes of the induced tree. From this decision-tree-based initial population, the optimal cluster size is generated. Our fitness function is defined as the inverse of a dissimilarity measure, and a bagging approach is used to save computational time.


Property-based Hierarchical Clustering of Peers using Mobile Agent for Unstructured P2P Systems (비구조화 P2P 시스템에서 이동에이전트를 이용한 Peer의 속성기반 계층적 클러스터링)

  • Salvo, Michael Angel G.;Mateo, Romeo Mark A.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.10 no.4 / pp.189-198 / 2009
  • Unstructured peer-to-peer systems are the most commonly used on today's internet. However, file placement in these systems is random, and no correlation exists between peers and their contents, so there is no guarantee that flooding queries will find the desired data. In this paper, we propose clustering the nodes of unstructured P2P systems with the agglomerative hierarchical clustering algorithm to improve the search method. We compared the delay time of clustering the nodes under our proposed algorithm and under the k-means clustering algorithm. We also simulated the delay time of locating data in a network topology and recorded the system overhead with our proposed algorithm, with k-means clustering, and without clustering. Simulation results show that the delay time of our proposed algorithm is shorter than that of the other methods and that resource overhead is also reduced.


Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / v.14 no.1 / pp.205-213 / 2007
  • In this paper, a normal mixture model whose component means are subject to a general linear restriction based on linear regression is proposed, and a fitting method using the EM algorithm and Lagrange multipliers is provided. The model is applied to gene clustering of microarray expression data, where it performs very well on a real data set. The model also lets an analyst obtain the clusters of interest by expressing hypotheses about the component means through the design matrices and the linear restriction matrices.

Comparison of Document Clustering algorithm using Genetic Algorithms by Individual Structures (개체 구조에 따른 유전자 알고리즘 기반의 문서 클러스터링 성능 비교)

  • Choi, Lim-Cheon;Song, Wei;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.16 no.3 / pp.47-56 / 2011
  • Applying a genetic algorithm to document clustering requires an appropriate individual structure. Document clustering with the genetic algorithm (DCGA) uses a centroid-vector individual structure. New document clustering with the genetic algorithm (NDCGA) uses a document-allocated individual structure. In this paper, to find a more suitable individual structure and process for document clustering, we analyzed the differences in the amount of computation, run time, and performance between the two methods through various experiments with both DCGA and NDCGA. The experimental results show that, compared to DCGA, NDCGA ran about 15% faster and performed about 5~10% better. This shows that the document-allocated structure is better suited to document clustering than the centroid-vector structure. In addition, NDCGA performed 15~25% better than the traditional clustering algorithms (K-means, Group Average).

Proposal of Cluster Head Election Method in K-means Clustering based WSN (K-평균 군집화 기반 WSN에서 클러스터 헤드 선택 방법 제안)

  • Yun, Dai Yeol;Park, SeaYoung;Hwang, Chi-Gon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.447-449 / 2021
  • Various wireless sensor network protocols have been proposed to keep the network alive for a long time by minimizing energy consumption. Clustering with the K-means algorithm takes longer than with traditional hierarchical algorithms because the center point must be moved repeatedly until the final clusters are established. In K-means clustering-based protocols, only the residual energy of nodes, or only the nodes near the center point of the cluster, is considered when the cluster head is elected. In this paper, we propose a new wireless sensor network protocol based on K-means clustering that improves energy efficiency while addressing the aforementioned problems.


Clustering Method of Weighted Preference Using K-means Algorithm and Bayesian Network for Recommender System (추천시스템을 위한 k-means 기법과 베이시안 네트워크를 이용한 가중치 선호도 군집 방법)

  • Park, Wha-Beum;Cho, Young-Sung;Ko, Hyung-Hwa
    • Journal of Information Technology Applications and Management / v.20 no.3_spc / pp.219-230 / 2013
  • Real-time accessibility and agility are required in ubiquitous commerce under a ubiquitous computing environment. Research has been actively conducted in e-commerce to improve the accuracy of recommendation. Existing collaborative filtering (CF) cannot reflect the contents of items, has difficulty selecting the neighborhood user group, and suffers from sparsity and scalability problems. Although systems addressing these defects have been put to practical use, they still do not reflect the attributes of items. In this paper, to solve this problem, we use an implicit method based on customer data and purchase history data. We propose a new clustering method of weighted preference for customers using k-means clustering and a Bayesian network in order to improve the accuracy of recommendation. To verify the improved performance of the proposed system, we conduct experiments with a dataset collected from an internet cosmetics shopping mall.

Design of Pattern Classification Rule based on Local Linear Discriminant Analysis Classifier by using Differential Evolutionary Algorithm (차분진화 알고리즘을 이용한 지역 Linear Discriminant Analysis Classifier 기반 패턴 분류 규칙 설계)

  • Roh, Seok-Beom;Hwang, Eun-Jin;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.1 / pp.81-86 / 2012
  • In this paper, we propose a new design methodology for a pattern classification rule based on local linear discriminant analysis, which extends generic linear discriminant analysis by applying it within local areas partitioned from the whole input space. Two methods, k-means clustering and the differential evolutionary algorithm, are used to partition the whole input space into several local areas. K-means clustering is an unsupervised clustering method, and the differential evolutionary algorithm is an optimization algorithm. In addition, the experimental application includes a comparative analysis with several methods commonly encountered in previous studies.