• Title, Summary, Keyword: Clustering


A Stigmergy-and-Neighborhood Based Ant Algorithm for Clustering Data

  • Lee, Hee-Sang;Shim, Gyu-Seok
    • Management Science and Financial Engineering
    • /
    • v.15 no.1
    • /
    • pp.81-96
    • /
    • 2009
  • Data mining, especially clustering, is one of the exciting research areas for ant-based algorithms. Ant clustering algorithms, however, have many difficulties in resolving practical clustering situations. We propose a new grid-based ant colony algorithm for clustering data. Previous ant-based clustering algorithms usually tried to find clusters while the ants picked up or dropped items, using some stigmergy information. In our ant clustering algorithm, we try to make the ants reflect neighborhood information within the storage nests. We use two ant classes, search ants and labor ants. In the initial step of the proposed algorithm, the search ants try to guide the characteristics of the storage nests. Then the labor ants try to classify the items using the guide information set by the search ants and the stigmergy information set by other labor ants. In this procedure, the clustering decisions of the ants are guided quickly and kept from stagnating. We tested our algorithm and compared it with other known algorithms on known and statistically generated data. These experiments show that the suggested ant mining algorithm finds clusters quickly and effectively compared with a known ant clustering algorithm.

Magnetoencephalography Interictal Spike Clustering in Relation with Surgical Outcome of Cortical Dysplasia

  • Jeong, Woorim;Chung, Chun Kee;Kim, June Sic
    • Journal of Korean Neurosurgical Society
    • /
    • v.52 no.5
    • /
    • pp.466-471
    • /
    • 2012
  • Objective : The aim of this study was to devise an objective clustering method for magnetoencephalography (MEG) interictal spike sources, and to identify the prognostic value of the new clustering method in adult epilepsy patients with cortical dysplasia (CD). Methods : We retrospectively analyzed 25 adult patients with histologically proven CD, who underwent MEG examination and surgical resection for intractable epilepsy. The mean postoperative follow-up period was 3.1 years. A hierarchical clustering method was adopted for MEG interictal spike source clustering. Clustered sources were then tested for their prognostic value toward surgical outcome. Results : Postoperative seizure outcome was Engel class I in 6 (24%), class II in 3 (12%), class III in 12 (48%), and class IV in 4 (16%) patients. With respect to MEG spike clustering, 12 of 25 (48%) patients showed 1 cluster, 2 (8%) showed 2 or more clusters within the same lobe, 10 (40%) showed 2 or more clusters in a different lobe, and 1 (4%) patient had only scattered spikes with no clustering. Patients who showed focal clustering achieved better surgical outcome than distributed cases (p=0.017). Conclusion : This is the first study that introduces an objective method to classify the distribution of MEG interictal spike sources. By using a hierarchical clustering method, we found that the presence of focal clustered spikes predicts a better postoperative outcome in epilepsy patients with CD.

Automatic Switching of Clustering Methods based on Fuzzy Inference in Bibliographic Big Data Retrieval System

  • Zolkepli, Maslina;Dong, Fangyan;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.4
    • /
    • pp.256-267
    • /
    • 2014
  • An automatic switch among ensembles of clustering algorithms is proposed as a part of the bibliographic big data retrieval system by utilizing a fuzzy inference engine as a decision support tool to select the fastest performing clustering algorithm between fuzzy C-means (FCM) clustering, Newman-Girvan clustering, and the combination of both. It aims to realize the best clustering performance with the reduction of computational complexity from $O(n^3)$ to $O(n)$. The automatic switch is developed using a fuzzy logic controller written in Java and accepts 3 inputs from each clustering result, i.e., number of clusters, number of vertices, and time taken to complete the clustering process. The experimental results on a PC (Intel Core i5-3210M at 2.50 GHz) demonstrate that the combination of both clustering algorithms is selected as the best performing algorithm in 20 out of 27 cases with the highest percentage of 83.99%, completed in 161 seconds. The self-adapted FCM is selected as the best performing algorithm in 4 cases and the Newman-Girvan is selected in 3 cases. The automatic switch is to be incorporated into the bibliographic big data retrieval system that focuses on visualization of fuzzy relationships using a hybrid approach combining FCM and the Newman-Girvan algorithm, and is planned to be released to the public through the Internet.
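
The fuzzy C-means (FCM) step named in the abstract above can be illustrated with a minimal textbook sketch. This is not the authors' Java implementation or their self-adapted variant; the function names, the fuzzifier `m = 2`, the fixed iteration count, and the hand-picked initial centers are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, centers, m=2.0, iters=50, eps=1e-12):
    """Textbook FCM: alternate membership and center updates.

    points  : list of coordinate lists
    centers : initial cluster centers (an assumption; the abstract gives none)
    m       : fuzzifier, must be > 1
    Returns (memberships, centers); memberships[j][i] is the degree to which
    point j belongs to cluster i (from the last membership update), and each
    row sums to 1.
    """
    c = len(centers)
    dims = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: closer centers receive a larger share.
        memberships = []
        for p in points:
            d = [max(euclidean(p, ctr), eps) for ctr in centers]
            memberships.append(
                [1.0 / sum((d[i] / d[k]) ** power for k in range(c)) for i in range(c)]
            )
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in memberships]
            total = sum(w)
            new_centers.append(
                [sum(wj * p[dim] for wj, p in zip(w, points)) / total for dim in range(dims)]
            )
        centers = new_centers
    return memberships, centers


# Hypothetical toy data: two well-separated groups of two points each.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
memberships, centers = fuzzy_c_means(points, centers=[[1.0, 1.0], [9.0, 9.0]])
```

Unlike hard clustering, every point keeps a graded membership in every cluster, which is the kind of fuzzy relationship the retrieval system described above sets out to visualize.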

A Novel Multi-Path Routing Algorithm Based on Clustering for Wireless Mesh Networks

  • Liu, Chun-Xiao;Zhang, Yan;Xu, E;Yang, Yu-Qiang;Zhao, Xu-Hui
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.4
    • /
    • pp.1256-1275
    • /
    • 2014
  • As one of the new self-organizing and self-configuring broadband networks, wireless mesh networks are becoming increasingly attractive. In order to solve the load balancing problem in wireless mesh networks, this paper proposes a novel multi-path routing algorithm based on clustering (Cluster_MMesh) for wireless mesh networks. In the clustering stage, on the basis of the maximum connectivity clustering algorithm and the k-hop clustering algorithm, and according to the idea of maximum connectivity, a new concept of node connectivity degree is proposed, which makes the selection of the cluster head simpler and more reasonable. While clustering, the node with the least expected load in the candidate border gateway node set is selected as the border gateway node. In the multi-path routing establishment stage, we use the intra-clustering multi-path routing algorithm and the inter-clustering multi-path routing algorithm to establish multi-path routing from the source node to the destination node. Finally, in the traffic allocation stage, we use the virtual disjoint multi-path model (Vdmp) to allocate the network traffic. Simulation results show that the Cluster_MMesh routing algorithm can help increase the packet delivery rate, reduce the average end-to-end delay, and improve the network performance.

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of the Korean Society of Marine Engineering
    • /
    • v.40 no.8
    • /
    • pp.726-732
    • /
    • 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Through the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA for only full data sets. In this paper, to specifically and robustly evaluate the performance of PCA for clustering analysis, we exploit an improved FCBF (fast correlation-based filter) feature selection method for supervised clustering data sets, and employ two well-known clustering algorithms: k-means and k-medoids. Computational results from supervised data sets show that the performance of PCA is very poor for large-scale features.

Clustering Algorithm for Sequences of Categorical Values (범주형 값들이 순서를 가지고 있는 데이터들의 클러스터링 기법)

  • 오승준;김재련
    • Journal of the Society of Korea Industrial and Systems Engineering
    • /
    • v.26 no.1
    • /
    • pp.17-21
    • /
    • 2003
  • We study a clustering algorithm for sequences of categorical values. Clustering is a data mining problem that has received significant attention from the database community. Traditional clustering algorithms deal with numerical or categorical data points. However, there exist many important databases that store categorical data sequences. In this paper, we introduce a new similarity measure and develop a hierarchical clustering algorithm. An experimental section shows the performance of the proposed approach.

An Improved K-means Document Clustering using Concept Vectors

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.4
    • /
    • pp.853-861
    • /
    • 2003
  • An improved K-means document clustering method is presented, where a concept vector is manipulated for each cluster on the basis of the cosine similarity of text documents. The concept vectors are unit vectors that have been normalized on the n-dimensional sphere. Because the standard K-means method is sensitive to the initial starting condition, our improvement focuses on the starting condition for estimating the modes of a distribution. The improved K-means clustering algorithm was applied to a set of text documents, called Classic3, to test the efficiency and correctness of the clustering result, and showed a 7% improvement in its worst case.

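
As a rough illustration of K-means with concept vectors and cosine similarity, here is a minimal sketch of standard spherical K-means. It does not reproduce the improved starting condition described in the abstract above: the seeds are simply passed in by the caller, and the function names and toy term-count vectors are hypothetical.

```python
import math


def normalize(v):
    """Scale v to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(docs, seeds, max_iter=100):
    """Cluster document vectors by cosine similarity to concept vectors.

    docs  : list of term-weight vectors (normalized internally)
    seeds : initial concept vectors (assumption: chosen by the caller)
    Returns (labels, concepts), where every concept is a unit vector.
    """
    docs = [normalize(d) for d in docs]
    concepts = [normalize(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar concept.
        new_labels = [
            max(range(len(concepts)), key=lambda j: dot(d, concepts[j]))
            for d in docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the concepts are too
        labels = new_labels
        # Update step: concept vector = normalized sum of its members.
        for j in range(len(concepts)):
            members = [d for d, lab in zip(docs, labels) if lab == j]
            if members:
                concepts[j] = normalize([sum(col) for col in zip(*members)])
    return labels, concepts


# Hypothetical term counts: two documents per topic.
docs = [[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 2, 0]]
labels, concepts = spherical_kmeans(docs, seeds=[docs[0], docs[2]])
```

Because the result depends on `seeds`, different seeds can yield different clusterings, which is the sensitivity to the starting condition that the abstract mentions.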

A study on the measurement for multidimensional entity clustering (다차원 clustering문제를 위한 척도에 관한 연구)

  • Lee, Cheol
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • /
    • pp.30-39
    • /
    • 1989
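  • In general, a clustering problem in which the number of clusters is undetermined is known to be a semistructured problem. Evaluation of solution quality is essential for structuring the clustering problem, but a measure that can be applied widely across application areas has not yet been developed. The main reason is that, although evaluation criteria for cluster solutions have been proposed at the conceptual level, these concepts have not been made concrete enough to be applied unambiguously when implementing a measure. The purpose of this study is to develop a measure for clustering problems in which the entity dimension is extended to multiple dimensions.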


An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

  • Lee, Kwangjin
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.387-397
    • /
    • 2003
  • Generally, most studies that need a variable-clustering process use an exploratory factor analysis technique or a divisive hierarchical variable-clustering method based on a correlation matrix. Some researchers apply an object-clustering method to a distance matrix transformed from a correlation matrix, though this approach is known to be improper. In this paper, an agglomerative hierarchical variable-clustering method based on the correlation matrix itself is suggested. It is derived from a geometric concept by using variate-spaces and a characterizing variate.

The Difference Order Clustering for Multi-dimensional Entities (다차원 개체를 위한 차이등급 clustering)

  • Rhee, Chul;Kang, Suk-Ho
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.14 no.1
    • /
    • pp.108-118
    • /
    • 1989
  • The clustering problem for multi-dimensional entities is investigated. A heuristic method, named Difference Order Clustering (DOC), is developed for the grouping of multi-dimensional entities. The DOC method has the advantage of identifying the bottleneck entities. Comparisons among the proposed DOC method, the modified rank order clustering (MODROC) method, and lexicographical rank order clustering using minimum spanning tree (lexico-MMSTROC) are illustrated by a part type selection problem.

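
For context on the rank order clustering family that the abstract above compares against, here is a minimal sketch of plain rank order clustering on a binary incidence matrix. It is neither DOC, MODROC, nor lexico-MMSTROC, none of which are specified in the abstract; the function name and the toy machine-part matrix are assumptions.

```python
def rank_order_clustering(matrix, max_iter=100):
    """Plain rank order clustering on a 0/1 incidence matrix.

    Rows and columns are repeatedly sorted in descending order of the binary
    word they spell out, until neither ordering changes.
    Returns (row_order, col_order) as lists of original indices.
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(max_iter):
        # Read each row across the current column order. Comparing the 0/1
        # lists lexicographically is the same as comparing binary numbers.
        new_rows = sorted(rows, key=lambda r: [matrix[r][c] for c in cols], reverse=True)
        # Read each column down the new row order and sort the same way.
        new_cols = sorted(cols, key=lambda c: [matrix[r][c] for r in new_rows], reverse=True)
        if new_rows == rows and new_cols == cols:
            break  # both orderings are stable
        rows, cols = new_rows, new_cols
    return rows, cols


# Hypothetical machine-part incidence matrix (rows: machines, columns: parts).
incidence = [
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
]
row_order, col_order = rank_order_clustering(incidence)
# Reordering by row_order and col_order gathers the 1s into diagonal blocks.
```

On this toy matrix, machines 0 and 2 end up grouped with parts 0 and 2, and machines 1 and 3 with parts 1, 3, and 4.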