• Title/Summary/Keyword: Clustering algorithms

High-Dimensional Clustering Technique using Incremental Projection (점진적 프로젝션을 이용한 고차원 클러스터링 기법)

  • Lee, Hye-Myung;Park, Young-Bae
    • Journal of KIISE:Databases / v.28 no.4 / pp.568-576 / 2001
  • Most clustering algorithms tend to degenerate rapidly in high-dimensional spaces. Moreover, high-dimensional data often contain a significant amount of noise, which further reduces the effectiveness of the algorithms. Therefore, it is necessary to develop algorithms adapted to the structure and characteristics of high-dimensional data. In this paper, we propose a clustering algorithm, CLIP, that uses projection. CLIP is designed to overcome efficiency and/or effectiveness problems in high-dimensional clustering. It is based on clustering in each one-dimensional subspace, but uses incremental projection to recover high-dimensional clusters and, at the same time, to reduce the computational cost significantly. To evaluate the performance of CLIP, we demonstrate its efficiency and effectiveness through a series of experiments on synthetic data sets.

Improvements of K-modes Algorithm and ROCK Algorithm (K-모드 알고리즘과 ROCK 알고리즘의 개선)

  • 김보화;김규성
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.381-393 / 2002
  • The K-modes algorithm and the ROCK (RObust Clustering using linKs) algorithm are useful clustering methods for large categorical data sets. In this paper, we investigate these algorithms and propose improved versions that correct their weaknesses. A simulation study shows that the proposed algorithms can improve clustering performance.

Cluster Analysis with Balancing Weight on Mixed-type Data

  • Chae, Seong-San;Kim, Jong-Min;Yang, Wan-Youn
    • Communications for Statistical Applications and Methods / v.13 no.3 / pp.719-732 / 2006
  • A set of clustering algorithms is presented in which a properly weighted distance formulation extends to data with mixed numeric and multiple binary values. The simple matching and Jaccard coefficients are used to measure similarity between objects for multiple binary attributes. Similarities between the i-th and j-th objects are converted to dissimilarities. The performance of clustering algorithms with balancing weight on different similarity measures is demonstrated. Our experiments show that clustering algorithms with a properly applied weight give a competitive recovery level when data with mixed numeric and multiple binary attributes are clustered.
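
The two similarity coefficients named in this abstract can be sketched for binary attribute vectors as follows. This is a minimal illustration, not the authors' formulation: the conversion `d = 1 - s` and the handling of two all-zero vectors are assumptions, since the abstract does not state them, and the function names and example vectors are hypothetical.

```python
def simple_matching(a, b):
    """Fraction of positions where two equal-length binary vectors agree."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard(a, b):
    """Shared 1s divided by positions where at least one vector has a 1."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two all-zero vectors are treated as identical.
    return both / either if either else 1.0


def dissimilarity(similarity):
    """Assumed conversion d = 1 - s (not stated in the abstract)."""
    return 1.0 - similarity


# Hypothetical binary attribute vectors for objects i and j.
obj_i = [1, 0, 1, 1, 0]
obj_j = [1, 1, 0, 1, 0]

smc = simple_matching(obj_i, obj_j)  # 3 agreements out of 5 positions
jac = jaccard(obj_i, obj_j)          # 2 shared 1s out of 4 positions with a 1
print(smc, jac, dissimilarity(smc), dissimilarity(jac))
```

The simple matching coefficient counts joint absences (0, 0) as agreement while the Jaccard coefficient ignores them, which is why the two measures differ on the same pair of objects.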

Recovering Module View of Software Architecture using Community Detection Algorithm (커뮤니티 검출기법을 이용한 소프트웨어 아키텍쳐 모듈 뷰 복원)

  • Kim, Jungmin;Lee, Changun
    • Journal of Software Engineering Society / v.25 no.4 / pp.69-74 / 2012
  • This article examines the applicability of community detection algorithms to the module recovery process of software architecture by comparing software clustering metrics with community detection metrics. In addition, it analyzes the relationships and differences between the separated modules and the measurement values of typical clustering algorithms and community detection algorithms. It then suggests several grounds on which community detection algorithms can be used to recover the module view of software architecture and, by comparing the measurement values of existing clustering metrics and community detection algorithms, shows the correlation between the two sets of results.

Design and Comparison of Error Correctors Using Clustering in Holographic Data Storage System

  • Kim, Sang-Hoon;Kim, Jang-Hyun;Yang, Hyun-Seok;Park, Young-Pil
    • 제어로봇시스템학회:학술대회논문집 / 2005.06a / pp.1076-1079 / 2005
  • Data storage, which involves writing and retrieving data, requires high storage capacity, a fast transfer rate, and a short access time. At present, no data storage system satisfies all of these conditions, but a holographic data storage system can achieve a faster data transfer rate because it is a page-oriented memory system that uses a volume hologram to write and retrieve data. A system architecture without mechanical actuating parts is possible, so a fast data transfer rate and a high storage capacity of about 1 Tb/cm³ can be realized. In this paper, to correct errors in binary data stored in a holographic digital data storage system, we find cluster centers using a clustering algorithm and reduce the intensities of pixels around the centers. We achieve this with two algorithms, C-means and subtractive clustering, and compare their results. With a proper clustering algorithm, the intensity profile of the data page becomes uniform and a better data storage system can be realized.

Nonnegative Matrix Factorization with Orthogonality Constraints

  • Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of Computing Science and Engineering / v.4 no.2 / pp.97-109 / 2010
  • Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data that decomposes a data matrix into a product of two factor matrices whose entries are all restricted to be nonnegative. NMF has been shown to be useful for clustering (especially document clustering), but in some cases it produces results that are inappropriate for clustering problems. In this paper, we present an algorithm for orthogonal nonnegative matrix factorization, in which an orthogonality constraint is imposed on the nonnegative decomposition of a term-document matrix. The result of orthogonal NMF can be clearly interpreted for clustering problems, and its clustering performance is usually better than that of standard NMF. We develop multiplicative updates directly from the true gradient on the Stiefel manifold, whereas existing algorithms consider additive orthogonality constraints. Experiments on several document data sets show that our orthogonal NMF algorithms perform better at clustering than standard NMF and an existing orthogonal NMF.
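
For orientation, the standard NMF baseline that this abstract compares against can be sketched with the well-known multiplicative updates for the squared Frobenius error. This is explicitly not the orthogonal variant proposed in the paper: it applies no orthogonality constraint and no Stiefel-manifold gradient. The toy matrix, the helper names, the random initialization, and the small `eps` guard are all assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, rank, iterations=200, seed=0, eps=1e-9):
    """Standard multiplicative-update NMF (no orthogonality constraint)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive random start so every entry can be rescaled.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


# Hypothetical 4-term x 4-document matrix with two topical blocks.
X = [[2, 1, 0, 0],
     [1, 2, 0, 0],
     [0, 0, 3, 1],
     [0, 0, 1, 3]]

W, H = nmf(X, rank=2)
# One common reading: assign each document (column) to its largest row of H.
labels = [max(range(len(H)), key=lambda k: H[k][j]) for j in range(len(X[0]))]
print(frobenius_error(X, W, H), labels)
```

Because the updates only rescale entries by nonnegative ratios, W and H stay nonnegative throughout, which is what allows the columns of H to be read as soft cluster memberships.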

Design and Comparison of Error Reduction Methods Using Clustering in Holographic Data Storage System (홀로그래픽 정보 저장 장치에서 클러스터링을 이용한 에러 감소 기법 제안 및 비교)

  • Kim Sang-Hoon;Kim Jang-Hyun;Yang Hyun-Seok;Park Young-Pil
    • 정보저장시스템학회:학술대회논문집 / 2005.10a / pp.83-87 / 2005
  • Data storage, which involves writing and retrieving data, requires high storage capacity, a fast transfer rate, and a short access time. At present, no data storage system satisfies all of these conditions, but a holographic data storage system can achieve a faster data transfer rate because it is a page-oriented memory system that uses a volume hologram to write and retrieve data. A system architecture without mechanical actuating parts is possible, so a fast data transfer rate and a high storage capacity of about 1 Tb/cm³ can be realized. In this paper, to correct errors in binary data stored in a holographic digital data storage system, we find cluster centers using a clustering algorithm and reduce the intensities of pixels around the centers. We achieve this with two algorithms, C-means and subtractive clustering, and compare their results. With a proper clustering algorithm, the intensity profile of the data page becomes uniform and a better data storage system can be realized.

A Hierarchical Clustering Algorithm Using Extended Sequence Element-based Similarity Measure (확장된 시퀀스 요소 기반의 유사도를 이용한 계층적 클러스터링 알고리즘)

  • Oh, Seung-Joon
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.321-327 / 2006
  • Recently there has been enormous growth in the amount of commercial and scientific data. Such datasets often consist of sequence data, which have an inherently sequential nature. However, only a few existing clustering algorithms take sequentiality into account. This study presents a similarity measure and a method for clustering such sequence datasets. In particular, we present an extended concept of the similarity measure that considers various conditions. Using a splice dataset, we show that the quality of the clusters generated by our proposed clustering algorithm is better than that of clusters produced by traditional clustering algorithms.

The Document Clustering using Multi-Objective Genetic Algorithms (다목적 유전자 알고리즘을 이용한 문서 클러스터링)

  • Lee, Jung-Song;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.17 no.2 / pp.57-64 / 2012
  • In this paper, a multi-objective genetic algorithm is proposed for document clustering, which is important in the text mining field. The most important function of a document clustering algorithm is to group similar documents in a corpus. So far, k-means clustering and genetic algorithms have been widely studied in this field. However, k-means clustering depends too heavily on the initial centroids, and the genetic algorithm has the disadvantage of easily falling into a local optimum depending on the fitness function. In this paper, a multi-objective genetic algorithm is applied to document clustering to compensate for these disadvantages, and its accuracy is analyzed and compared with that of the existing algorithms. In our experimental results, the multi-objective genetic algorithm introduced in this paper improves accuracy over k-means clustering (by about 20%) and over the general genetic algorithm (by about 17%) for document clustering.
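
The sensitivity of k-means to its initial centroids, which this abstract cites as motivation, can be seen in a small one-dimensional sketch. This is only an illustration of that weakness using the usual Lloyd iteration; it does not implement the multi-objective genetic algorithm from the paper, and the data points, function name, and starting centroids are hypothetical.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's k-means on 1-D data from given starting centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[k]
                   for k, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Three well-separated pairs of points.
points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# One starting centroid per pair recovers the natural grouping.
good_centroids, good_sse = kmeans_1d(points, [0.5, 10.5, 20.5])

# Two starting centroids inside the first pair get stuck in a poor local optimum.
bad_centroids, bad_sse = kmeans_1d(points, [0.0, 1.0, 15.0])

print(good_centroids, good_sse)
print(bad_centroids, bad_sse)
```

Both runs converge, but the second one splits the first pair across two centroids and merges the other two pairs, giving a much larger SSE. Search strategies such as genetic algorithms are one way to reduce this dependence on the starting point.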

Clustering Routing Algorithms In Wireless Sensor Networks: An Overview

  • Liu, Xuxun;Shi, Jinglun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.7 / pp.1735-1755 / 2012
  • Wireless sensor networks (WSNs) are becoming increasingly attractive for a variety of applications and have become a hot research area. Routing is a key technology in WSNs and can be coarsely divided into two categories: flat routing and hierarchical routing. In a flat topology, all nodes perform the same task and have the same functionality in the network. In contrast, nodes in a hierarchical topology perform different tasks in WSNs and are typically organized into lots of clusters according to specific requirements or metrics. Owing to a variety of advantages, clustering routing protocols are becoming an active branch of routing technology in WSNs. In this paper, we present an overview on clustering routing algorithms for WSNs with focus on differentiating them according to diverse cluster shapes. We outline the main advantages of clustering and discuss the classification of clustering routing protocols in WSNs. In particular, we systematically analyze the typical clustering routing protocols in WSNs and compare the different approaches based on various metrics. Finally, we conclude the paper with some open questions.