• Title/Summary/Keyword: divisive clustering

Search Result 9, Processing Time 0.023 seconds

Double monothetic clustering for histogram-valued data

  • Kim, Jaejik;Billard, L.
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.3
    • /
    • pp.263-274
    • /
    • 2018
  • One of the common issues in large dataset analyses is to detect and construct homogeneous groups of objects in those datasets. This is typically done by some form of clustering technique. In this study, we present a divisive hierarchical clustering method for two monothetic characteristics of histogram data. Unlike classical data points, a histogram has internal variation of itself as well as location information. However, to find the optimal bipartition, existing divisive monothetic clustering methods for histogram data consider only location information as a monothetic characteristic and they cannot distinguish histograms with the same location but different internal variations. Thus, a divisive clustering method considering both location and internal variation of histograms is proposed in this study. The method has an advantage in interpreting clustering outcomes by providing binary questions for each split. The proposed clustering method is verified through a simulation study and applied to a large U.S. house property value dataset.

Heuristic Algorithm for High-Speed Clustering of Neighbor Vehicular Position Coordinate (주변 차량 위치 좌표의 고속 클러스터링을 위한 휴리스틱 알고리즘)

  • Choi, Yoon-Ho;Yoo, Seung-Ho;Seo, Seung-Woo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39C no.4
    • /
    • pp.343-350
    • /
    • 2014
  • Divisive hierarchical clustering algorithms iterate the process of decomposition and clustering data recursively. In each recursive call, data in each cluster are arbitrarily selected and thus, the total clustering time can be increased, which causes a problem that it is difficult to apply the process of clustering neighbor vehicular position data in vehicular localization. In this paper, we propose a new heuristic algorithm for speeding up the clustering time by eliminating randomness of the selected data in the process of generating the initial divisive clusters.

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

  • Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference
    • /
    • 2008.10a
    • /
    • pp.406-409
    • /
    • 2008
  • Clustering data streams has an importance over many applications like sensor networks. Existing hierarchical methods follow a semi fuzzy clustering that yields duplicate clusters. In order to solve the problems, we propose an extended online divisive agglomerative clustering on data streams. It builds a tree-like top-down hierarchy of clusters that evolves with data streams using geometric time frame for snapshots. It is an enhancement of the Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy to avoid duplicate clusters. Our main features are providing update time and memory space which is independent of the number of examples on data streams. It can be utilized for clustering sensor data and network monitoring as well as web click streams.

  • PDF

An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

  • Lee, Kwangjin
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.387-397
    • /
    • 2003
  • Generally, most of researches that need a variable-clustering process use an exploratory factor analysis technique or a divisive hierarchical variable-clustering method based on a correlation matrix. And some researchers apply a object-clustering method to a distance matrix transformed from a correlation matrix, though this approach is known to be improper. On this paper an agglomerative hierarchical variable-clustering method based on a correlation matrix itself is suggested. It is derived from a geometric concept by using variate-spaces and a characterizing variate.

A Hashing Method Using PCA-based Clustering (PCA 기반 군집화를 이용한 해슁 기법)

  • Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.6
    • /
    • pp.215-218
    • /
    • 2014
  • In hashing-based methods for approximate nearest neighbors(ANN) search, by mapping data points to k-bit binary codes, nearest neighbors are searched in a binary embedding space. In this paper, we present a hashing method using a PCA-based clustering method, Principal Direction Divisive Partitioning(PDDP). PDDP is a clustering method which repeatedly partitions the cluster with the largest variance into two clusters by using the first principal direction. The proposed hashing method utilizes the first principal direction as a projective direction for binary coding. Experimental results demonstrate that the proposed method is competitive compared with other hashing methods.

A Survey of Advances in Hierarchical Clustering Algorithms and Applications

  • Munshi, Amr
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.5
    • /
    • pp.17-24
    • /
    • 2022
  • Hierarchical clustering methods have been proposed for more than sixty years and yet are used in various disciplines for relation observation and clustering purposes. In 1965, divisive hierarchical methods were proposed in biological sciences and have been used in various disciplines such as, and anthropology, ecology. Furthermore, recently hierarchical methods are being deployed in economy and energy studies. Unlike most clustering algorithms that require the number of clusters to be specified by the user, hierarchical clustering is well suited for situations where the number of clusters is unknown. This paper presents an overview of the hierarchical clustering algorithm. The dissimilarity measurements that can be utilized in hierarchical clustering algorithms are discussed. Further, the paper highlights the various and recent disciplines where the hierarchical clustering algorithms are employed.

A Divisive Clustering for Mixed Feature-Type Symbolic Data (혼합형태 심볼릭 데이터의 군집분석방법)

  • Kim, Jaejik
    • The Korean Journal of Applied Statistics
    • /
    • v.28 no.6
    • /
    • pp.1147-1161
    • /
    • 2015
  • Nowadays we are considering and analyzing not only classical data expressed by points in the p-dimensional Euclidean space but also new types of data such as signals, functions, images, and shapes, etc. Symbolic data also can be considered as one of those new types of data. Symbolic data can have various formats such as intervals, histograms, lists, tables, distributions, models, and the like. Up to date, symbolic data studies have mainly focused on individual formats of symbolic data. In this study, it is extended into datasets with both histogram and multimodal-valued data and a divisive clustering method for the mixed feature-type symbolic data is introduced and it is applied to the analysis of industrial accident data.

An Interactive e-HealthCare Framework Utilizing Online Hierarchical Clustering Method (온라인 계층적 군집화 기법을 활용한 양방향 헬스케어 프레임워크)

  • Musa, Ibrahim Musa Ishag;Jung, Sukho;Shin, DongMun;Yi, Gyeong Min;Lee, Dong Gyu;Sohn, Gyoyong;Ryu, Keun Ho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.04a
    • /
    • pp.399-400
    • /
    • 2009
  • As a part of the era of human centric applications people started to care about their well being utilizing any possible mean. This paper proposes a framework for real time on-body sensor health-care system, addresses the current issues in such systems, and utilizes an enhanced online divisive agglomerative clustering algorithm (EODAC); an algorithm that builds a top-down tree-like structure of clusters that evolves with streaming data to rationally cluster on-body sensor data and give accurate diagnoses remotely, guaranteeing high performance, and scalability. Furthermore it does not depend on the number of data points.

A Comparative Study on the Agglomerative and Divisive Methods for Hierarchical Document Clustering (계층적 문서 클러스터링을 위한 응집식 기법과 분할식 기법의 비교 연구)

  • Lee, Jae-Yun;Jeong, Jin-Ah
    • Proceedings of the Korean Society for Information Management Conference
    • /
    • 2005.08a
    • /
    • pp.65-70
    • /
    • 2005
  • 계층적 문서 클러스터링에 있어서 실험집단에 따라 응집식 기법과 분할식 기법의 성능이 다르며, 이를 좌우하는 요소는 분류의 깊이, 즉 분류수준이라고 가정하였다. 조금만 나누면 되는 대분류인 경우는 상대적으로 분할식 기법이 유리하고, 조금만 합치면 되는 소분류인 경우에는 응집식 기법이 유리할 것이라고 판단했기 때문이다. 그에 따라 분할식 클러스터링 기법인 양분(Bisecting) K-means기법과 응집식 기법인 완전연결, 평균연결, WARD기법의 성능을 실험집단이 대분류인 경우와 소분류인 경우의 유사계수를 적용하여 각 기법별 성능을 비교하여 실험집단의 특성에 따른 적합 클러스터링 기법을 찾고자 하였다. 실험결과 응집식 기법과 분할식 기법의 성능 우열에 영향을 미치는 것은 분류수준보다는 변이계수로 측정된 상대적인 군집의 크기 편차인 것으로 나타났다.

  • PDF