• Title/Abstract/Keyword: Clustering Algorithm

Search results: 2,050

On hierarchical clustering in sufficient dimension reduction

  • Yoo, Chaeyeon;Yoo, Younju;Um, Hye Yeon;Yoo, Jae Keun
• Communications for Statistical Applications and Methods / Vol. 27 No. 4 / pp. 431-443 / 2020
  • The K-means clustering algorithm has been applied successfully in sufficient dimension reduction. Unfortunately, the algorithm has issues with reproducibility and nestedness, which are discussed in this paper. These are clear deficits of the K-means clustering algorithm; the hierarchical clustering algorithm, by contrast, offers both reproducibility and nestedness, yet an intensive comparison of the two algorithms in a sufficient dimension reduction context has not yet been done. In this paper, we rigorously study the two clustering algorithms for two popular sufficient dimension reduction methodologies, the inverse mean and clustering mean methods, through intensive numerical studies. Simulation studies and two real data examples confirm that the hierarchical clustering algorithm has a potential advantage over the K-means algorithm.
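As an aside for readers unfamiliar with the two properties, the following minimal sketch (scikit-learn on synthetic blobs, not the paper's inverse-mean or clustering-mean SDR setting) illustrates why hierarchical clustering is nested and reproducible while K-means need not be: cutting one dendrogram at k and k+1 gives nested partitions, whereas two K-means runs with different seeds can disagree.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic data standing in for the reduced predictors in an SDR analysis.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Hierarchical clustering: the k- and (k+1)-cluster solutions come from the
# same dendrogram, so each finer cluster lies inside exactly one coarser
# cluster (nestedness), and rerunning gives identical labels (reproducibility).
coarse = AgglomerativeClustering(n_clusters=3).fit_predict(X)
fine = AgglomerativeClustering(n_clusters=4).fit_predict(X)
nested = all(len(set(coarse[fine == c])) == 1 for c in np.unique(fine))
print("hierarchical partitions nested:", nested)

# K-means: different random initializations may yield different partitions.
a = KMeans(n_clusters=4, n_init=1, random_state=1).fit_predict(X)
b = KMeans(n_clusters=4, n_init=1, random_state=2).fit_predict(X)
print("agreement between two K-means runs (ARI):", adjusted_rand_score(a, b))
```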

Heuristic Algorithm to Raise Efficiency in Clustering

  • 이석환;박승헌
• 대한안전경영과학회지 / Vol. 11 No. 3 / pp. 157-166 / 2009
  • In this study, we developed a heuristic algorithm that achieves better clustering efficiency than conventional algorithms. Conventional clustering algorithms suffered from low clustering efficiency because they had no solid method for selecting the initial cluster centers and had difficulty searching for clustering solutions. The EMC (Expanded Moving Center) heuristic algorithm is suggested to resolve this problem of low clustering efficiency. We developed the algorithm to select initial cluster centers and to search for clustering solutions systematically. Clustering experiments were performed to evaluate the performance of the EMC heuristic algorithm. In a real case study, the squared error of the EMC heuristic algorithm was better than that of the other algorithms, and its advantage grew as the number of clusters increased.
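The EMC procedure itself is not detailed in the abstract; purely as a hedged illustration of the squared-error criterion used above to compare clustering solutions, a small NumPy helper (with hypothetical data and labels) might look like this.

```python
import numpy as np

def squared_error(X, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        centroid = members.mean(axis=0)
        total += ((members - centroid) ** 2).sum()
    return total

# Toy example: two obvious clusters and two labelings to score.
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
print(squared_error(X, np.array([0, 0, 1, 1])))  # low error: good clustering
print(squared_error(X, np.array([0, 1, 0, 1])))  # high error: poor clustering
```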

More Efficient k-Modes Clustering Algorithm

  • Kim, Dae-Won;Chae, Yi-Geun
• Journal of the Korean Data and Information Science Society / Vol. 16 No. 3 / pp. 549-556 / 2005
  • The hard-type centroids used in conventional clustering algorithms such as the k-modes algorithm cannot preserve the uncertainty inherent in data sets for as long as possible before the actual clustering decision is made. Therefore, we propose the k-populations algorithm to extend clustering ability and to reflect the data characteristics. The k-populations algorithm was found to give markedly better clustering results in various experiments.
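As a rough illustration of the contrast the abstract draws (and not the authors' exact k-populations formulation), the sketch below compares a hard k-modes-style centroid, which keeps only the most frequent category per attribute, with a soft "population" centroid that retains the full category frequencies and therefore the uncertainty in the data.

```python
import numpy as np

# Categorical toy data: each row is an object, each column an attribute.
X = np.array([["red", "small"], ["red", "small"], ["blue", "large"], ["blue", "small"]])
labels = np.array([0, 0, 1, 1])

def hard_mode_centroid(rows):
    """k-modes style centroid: the most frequent category per attribute."""
    return [max(set(col), key=list(col).count) for col in rows.T]

def population_centroid(rows):
    """Soft centroid keeping the category frequencies per attribute,
    in the spirit of retaining uncertainty before the final decision."""
    return [{v: float(np.mean(col == v)) for v in set(col)} for col in rows.T]

for c in np.unique(labels):
    members = X[labels == c]
    print("cluster", c, "mode:", hard_mode_centroid(members))
    print("cluster", c, "populations:", population_centroid(members))
```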


A Stigmergy-and-Neighborhood Based Ant Algorithm for Clustering Data

  • Lee, Hee-Sang;Shim, Gyu-Seok
• Management Science and Financial Engineering / Vol. 15 No. 1 / pp. 81-96 / 2009
  • Data mining, and clustering in particular, is one of the exciting research areas for ant-based algorithms. Ant clustering algorithms, however, have many difficulties in handling practical clustering situations. We propose a new grid-based ant colony algorithm for clustering data. Previous ant-based clustering algorithms usually tried to find clusters during the pick-up or drop-down process of the ants' items using stigmergy information. In our ant clustering algorithm, the ants also reflect neighborhood information within the storage nests. We use two ant classes, search ants and labor ants. In the initial step of the proposed algorithm, the search ants build guiding information about the characteristics of the storage nests. Then the labor ants classify the items using the guide information set by the search ants and the stigmergy information set by other labor ants. In this procedure the clustering decisions of the ants are guided quickly and kept out of stagnation. We compared our algorithm with other known algorithms on known and synthetically generated data. These experiments show that the suggested ant mining algorithm finds clusters quickly and effectively compared with a known ant clustering algorithm.
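The two-class search/labor ant scheme is specific to this paper; the sketch below only shows the generic neighborhood-density pick-up and drop-down probabilities used in grid-based ant clustering (a Lumer-Faieta-style rule with illustrative constants), which is the kind of stigmergy-plus-neighborhood decision the abstract refers to.

```python
import numpy as np

def neighborhood_density(item, neighbors, alpha=0.5):
    """Average similarity of an item to the items in its grid neighborhood."""
    if len(neighbors) == 0:
        return 0.0
    d = np.linalg.norm(neighbors - item, axis=1)
    return max(0.0, float(np.mean(1.0 - d / alpha)))

def pick_probability(f, k1=0.1):
    # Isolated items (low local density f) are picked up with high probability.
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    # Items are dropped where similar items are already concentrated.
    return (f / (k2 + f)) ** 2

item = np.array([0.2, 0.1])
neighbors = np.array([[0.25, 0.12], [0.18, 0.05], [0.9, 0.8]])
f = neighborhood_density(item, neighbors)
print("density:", f, "pick:", pick_probability(f), "drop:", drop_probability(f))
```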

Path based K-means Clustering for RFID Data Sets

  • Yun, Hong-Won
• Journal of information and communication convergence engineering / Vol. 6 No. 4 / pp. 434-438 / 2008
  • Massive data are continuously produced at a rate of over several terabytes every day. Applications that handle such data need effective clustering algorithms to achieve high overall computational performance. In this paper, we propose an ancestor-as-cluster-center approach to clustering: a K-means algorithm that uses ancestors as cluster centers. We modify the K-means algorithm and present a clustering architecture and a clustering algorithm that minimize I/Os and show excellent performance. In our experimental performance evaluation, we show that our algorithm can improve the I/O speed and the query processing time.

An Improved Automated Spectral Clustering Algorithm

  • Xiaodan Lv
• Journal of Information Processing Systems / Vol. 20 No. 2 / pp. 185-199 / 2024
  • In this paper, an improved automated spectral clustering (IASC) algorithm is proposed to address the limitations of the traditional spectral clustering (TSC) algorithm, particularly its inability to automatically determine the number of clusters. First, a cluster-number evaluation factor based on the optimal clustering principle is proposed: by iterating over different k values, the value with the largest evaluation factor is selected as the number of clusters. Second, the IASC algorithm adopts a density-sensitive distance to measure the similarity between sample points, which gives high similarity to data distributed in the same high-density area. Third, to improve clustering accuracy, the IASC algorithm uses the cosine-angle classification method instead of K-means to classify the eigenvectors. Six algorithms (K-means, fuzzy C-means, TSC, EIGENGAP, DBSCAN, and density peak) were compared with the proposed algorithm on six datasets. The results show that the IASC algorithm not only automatically determines the number of clusters but also obtains better clustering accuracy on both synthetic and UCI datasets.
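The evaluation factor, density-sensitive distance, and cosine-angle step are the paper's own components and are not reproduced here; the sketch below (scikit-learn, with silhouette score as a stand-in evaluation factor on synthetic blobs) only illustrates the outer loop of scoring candidate k values and keeping the best one.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

best_k, best_score = None, -np.inf
for k in range(2, 7):                        # iterate over candidate cluster numbers
    labels = SpectralClustering(n_clusters=k, affinity="nearest_neighbors",
                                n_neighbors=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)      # stand-in for the paper's evaluation factor
    if score > best_score:
        best_k, best_score = k, score
print("selected number of clusters:", best_k)
```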

Medoid Determination in Deterministic Annealing-based Pairwise Clustering

  • Lee, Kyung-Mi;Lee, Keon-Myung
• International Journal of Fuzzy Logic and Intelligent Systems / Vol. 11 No. 3 / pp. 178-183 / 2011
  • The deterministic annealing-based clustering algorithm is an EM-based algorithm that behaves like the simulated annealing method, yet is less sensitive to parameter initialization. Pairwise clustering is a clustering technique that works with inter-entity distance information without requiring detailed attribute information. The pairwise deterministic annealing-based clustering algorithm repeatedly alternates between estimating mean fields and updating the membership degrees of data objects to clusters until a termination condition holds. Lacking attribute value information, pairwise clustering algorithms do not explicitly determine the centroids or medoids of clusters during the clustering process or at its end. This paper proposes a method to identify medoids as the centers of the formed clusters for the pairwise deterministic annealing-based clustering algorithm. Experimental results show that the proposed method locates meaningful medoids.
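The abstract does not state the exact medoid criterion; one natural reading, shown here as a hedged sketch rather than the authors' formula, is to select for each cluster the object that minimizes the membership-weighted sum of pairwise distances.

```python
import numpy as np

def find_medoids(D, membership):
    """D: (n, n) pairwise distance matrix.
    membership: (n, k) membership degrees of each object to each cluster.
    Returns, per cluster, the index of the object with the smallest
    membership-weighted total distance to all objects."""
    n, k = membership.shape
    medoids = []
    for c in range(k):
        cost = D @ membership[:, c]          # weighted distance of each object to cluster c
        medoids.append(int(np.argmin(cost)))
    return medoids

# Toy pairwise distances for five objects and soft memberships to two clusters.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
membership = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9], [0.5, 0.5]])
print(find_medoids(D, membership))
```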

Two-Phase Hierarchical Clustering Algorithm for Group Formation in Data Mining

  • 황인수
• 경영과학 / Vol. 19 No. 1 / pp. 189-196 / 2002
  • Data clustering is often one of the first steps in data mining analysis. It identifies groups of related objects that can be used as a starting point for exploring further relationships. This technique supports the development of population segmentation models, such as demographic-based customer segmentation. This paper presents the development of a two-phase hierarchical clustering algorithm for group formation. Applications of the algorithm to product-customer group formation in customer relationship management are also discussed. Computer simulations show that the suggested algorithm outperforms the single-link method and k-means clustering.

Geodesic Clustering for Covariance Matrices

  • Lee, Haesung;Ahn, Hyun-Jung;Kim, Kwang-Rae;Kim, Peter T.;Koo, Ja-Yong
• Communications for Statistical Applications and Methods / Vol. 22 No. 4 / pp. 321-331 / 2015
  • The K-means clustering algorithm is a popular and widely used clustering method. This paper considers a geodesic clustering algorithm for data consisting of symmetric positive definite (SPD) matrices, combining the Riemannian (non-Euclidean) geometric structure of SPD matrices with the idea of the K-means clustering algorithm. A K-means clustering algorithm has two main steps, for which we need a dissimilarity measure between two matrix data points and a way of computing centroids for the observations in each cluster. In order to use the Riemannian structure, we adopt the geodesic distance and the intrinsic mean for symmetric positive definite matrices. We demonstrate the proposed method through simulations as well as an application to real financial data.
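The two Riemannian ingredients named here are standard and can be sketched directly (assuming SciPy; the paper's estimators may differ in detail): the affine-invariant geodesic distance between SPD matrices and a fixed-point iteration for their intrinsic (Karcher) mean.

```python
import numpy as np
from scipy.linalg import expm, fractional_matrix_power, logm

def geodesic_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    A_inv_sqrt = fractional_matrix_power(A, -0.5)
    M = A_inv_sqrt @ B @ A_inv_sqrt
    return np.linalg.norm(logm(M).real, "fro")   # .real drops numerical noise

def intrinsic_mean(mats, n_iter=20):
    """Karcher-mean fixed-point iteration: average the log-maps at the
    current estimate, then map back with the matrix exponential."""
    G = np.mean(mats, axis=0)                    # initialize at the Euclidean mean
    for _ in range(n_iter):
        G_sqrt = fractional_matrix_power(G, 0.5)
        G_inv_sqrt = fractional_matrix_power(G, -0.5)
        T = np.mean([logm(G_inv_sqrt @ S @ G_inv_sqrt) for S in mats], axis=0).real
        G = G_sqrt @ expm(T) @ G_sqrt
    return G

# Two toy 2x2 covariance (SPD) matrices.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
print("geodesic distance:", geodesic_distance(A, B))
print("intrinsic mean:\n", intrinsic_mean([A, B]))
```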

A Density Peak Clustering Algorithm Based on Information Bottleneck

  • Yongli Liu;Congcong Zhao;Hao Chao
• Journal of Information Processing Systems / Vol. 19 No. 6 / pp. 778-790 / 2023
  • Although density peak clustering can often easily yield excellent results, there is still room for improvement when dealing with complex, high-dimensional datasets. One of the main limitations of the algorithm is its reliance on geometric distance as the sole similarity measure. To address this limitation, we draw inspiration from information bottleneck theory and propose a novel density peak clustering algorithm that incorporates this theory as a similarity measure. Specifically, our algorithm uses the joint probability distribution between data objects and feature information and employs the loss of mutual information as the measurement standard. This approach not only eliminates the potential for subjective error in selecting a similarity method, but also enhances performance on datasets with multiple centers and high dimensionality. To evaluate the effectiveness of our algorithm, we conducted experiments on ten carefully selected datasets and compared the results with three other algorithms. The experimental results demonstrate that our information bottleneck-based density peak clustering (IBDPC) algorithm consistently achieves high accuracy, highlighting its potential as a valuable tool for data clustering tasks.
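The information-bottleneck similarity is the paper's contribution and is not reproduced here; the sketch below only shows the generic density-peak quantities (local density rho and distance-to-higher-density delta) computed from an arbitrary pairwise dissimilarity matrix, so a different similarity measure can be plugged in.

```python
import numpy as np

def density_peaks(D, dc):
    """Compute the two density-peak quantities from a pairwise distance matrix D:
    rho[i]   - number of points within the cutoff distance dc of point i,
    delta[i] - distance from i to the nearest point with higher density."""
    n = D.shape[0]
    rho = (D < dc).sum(axis=1) - 1           # subtract 1 to exclude the point itself
    delta = np.zeros(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        delta[i] = D[i].max() if higher.size == 0 else D[i, higher].min()
    return rho, delta

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # plug in any dissimilarity here
rho, delta = density_peaks(D, dc=0.5)
# Cluster centers are the points with both high rho and high delta.
print("candidate centers:", np.argsort(rho * delta)[-2:])
```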