Search | Korea Science

Double monothetic clustering for histogram-valued data

Kim, Jaejik;Billard, L.
- Communications for Statistical Applications and Methods, v.25 no.3, pp.263-274, 2018
A common issue in the analysis of large datasets is detecting and constructing homogeneous groups of objects, which is typically done with some form of clustering technique. In this study, we present a divisive hierarchical clustering method based on two monothetic characteristics of histogram data. Unlike a classical data point, a histogram carries its own internal variation as well as location information. However, when searching for the optimal bipartition, existing divisive monothetic clustering methods for histogram data consider only location information as a monothetic characteristic, so they cannot distinguish histograms with the same location but different internal variations. This study therefore proposes a divisive clustering method that considers both the location and the internal variation of histograms. The method makes clustering outcomes easier to interpret by providing a binary question for each split. The proposed method is verified through a simulation study and applied to a large U.S. house property value dataset.
https://doi.org/10.29220/CSAM.2018.25.3.263
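The idea of a binary question on either location or internal variation can be illustrated with a minimal Python sketch. It is not the authors' algorithm: summarizing each histogram by its mean and standard deviation (assuming values are uniform within each bin), the within-group sum-of-squares criterion, and the names `histogram_summary` and `best_binary_question` are all assumptions made for illustration.

```python
import math

def histogram_summary(bins):
    """bins: list of (lower, upper, relative_frequency).
    Returns (mean, sd), assuming values are uniform within each bin."""
    total = sum(p for _, _, p in bins)
    mean = sum(p * (a + b) / 2 for a, b, p in bins) / total
    # E[X^2] of a uniform variable on [a, b] is (a^2 + a*b + b^2) / 3
    second = sum(p * (a * a + a * b + b * b) / 3 for a, b, p in bins) / total
    return mean, math.sqrt(max(second - mean * mean, 0.0))

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_binary_question(histograms):
    """Try every cut point on the mean and on the sd, and keep the question
    'is <characteristic> <= cut?' with the smallest within-group cost.
    The cost (hypothetical criterion) adds the SSE of both summaries."""
    summaries = [histogram_summary(h) for h in histograms]
    best = None
    for k, name in enumerate(("mean", "sd")):
        values = sorted(set(s[k] for s in summaries))
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            yes = [i for i, s in enumerate(summaries) if s[k] <= cut]
            no = [i for i, s in enumerate(summaries) if s[k] > cut]
            cost = sum(sse([summaries[i][d] for i in group])
                       for group in (yes, no) for d in (0, 1))
            if best is None or cost < best[0]:
                best = (cost, name, cut, yes, no)
    return best

# Four histograms with the same location (mean 5) but different spread.
narrow = [(4, 5, 0.5), (5, 6, 0.5)]
wide = [(0, 5, 0.5), (5, 10, 0.5)]
cost, name, cut, yes, no = best_binary_question([narrow, narrow, wide, wide])
print(f"Is {name} <= {cut:.3f}?  yes={yes}  no={no}")
```

Because all four histograms share the same mean, a location-only split has no cut point to offer here, while the question on the standard deviation separates the narrow histograms from the wide ones.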

Heuristic Algorithm for High-Speed Clustering of Neighbor Vehicular Position Coordinate (주변 차량 위치 좌표의 고속 클러스터링을 위한 휴리스틱 알고리즘)

Choi, Yoon-Ho;Yoo, Seung-Ho;Seo, Seung-Woo
- The Journal of Korean Institute of Communications and Information Sciences, v.39C no.4, pp.343-350, 2014
Divisive hierarchical clustering algorithms recursively repeat the process of decomposing and clustering data. In each recursive call, the data in each cluster are selected arbitrarily, which can increase the total clustering time and makes it difficult to apply such algorithms to clustering neighbor vehicular position data in vehicular localization. In this paper, we propose a new heuristic algorithm that reduces the clustering time by eliminating the randomness of the selected data when generating the initial divisive clusters.
https://doi.org/10.7840/kics.2014.39C.4.343

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
- Proceedings of the KSRS Conference, 2008.10a, pp.406-409, 2008
Clustering data streams is important in many applications, such as sensor networks. Existing hierarchical methods follow a semi-fuzzy clustering that yields duplicate clusters. To solve this problem, we propose an extended online divisive agglomerative clustering method for data streams. It builds a tree-like, top-down hierarchy of clusters that evolves with the data stream, using a geometric time frame for snapshots. It enhances Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy that avoids duplicate clusters. Its main feature is that update time and memory usage are independent of the number of examples in the data stream. It can be used for clustering sensor data, network monitoring, and web click streams.

An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

Lee, Kwangjin
- Communications for Statistical Applications and Methods, v.10 no.2, pp.387-397, 2003
Most research that requires a variable-clustering process uses either an exploratory factor analysis technique or a divisive hierarchical variable-clustering method based on a correlation matrix. Some researchers instead apply an object-clustering method to a distance matrix transformed from a correlation matrix, although this approach is known to be improper. This paper suggests an agglomerative hierarchical variable-clustering method based on the correlation matrix itself. The method is derived from a geometric concept using variate-spaces and a characterizing variate.
https://doi.org/10.5351/CKSS.2003.10.2.387

A Hashing Method Using PCA-based Clustering (PCA 기반 군집화를 이용한 해슁 기법)

Park, Cheong Hee
- KIPS Transactions on Software and Data Engineering, v.3 no.6, pp.215-218, 2014
In hashing-based methods for approximate nearest neighbor (ANN) search, data points are mapped to k-bit binary codes and nearest neighbors are searched in the binary embedding space. In this paper, we present a hashing method that uses a PCA-based clustering method, Principal Direction Divisive Partitioning (PDDP). PDDP is a clustering method that repeatedly partitions the cluster with the largest variance into two clusters using the first principal direction. The proposed hashing method uses the first principal direction as the projection direction for binary coding. Experimental results demonstrate that the proposed method is competitive with other hashing methods.
https://doi.org/10.3745/KTSDE.2014.3.6.215
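The PDDP step described in this abstract (split the cluster with the largest variance along its first principal direction) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the principal direction is approximated by power iteration, cluster scatter is used as the variance measure, points are split at the centroid, and the function names are hypothetical.

```python
import math

def mean_vector(points):
    n, dim = len(points), len(points[0])
    return [sum(p[j] for p in points) / n for j in range(dim)]

def scatter(points):
    """Sum of squared distances to the centroid (used here as 'variance')."""
    mu = mean_vector(points)
    return sum(sum((p[j] - mu[j]) ** 2 for j in range(len(mu))) for p in points)

def first_principal_direction(points, iters=200):
    """Approximate the leading eigenvector of the scatter matrix by power
    iteration. Assumption: the start vector is not orthogonal to it."""
    mu = mean_vector(points)
    dim = len(mu)
    centered = [[p[j] - mu[j] for j in range(dim)] for p in points]
    v = [float(j + 1) for j in range(dim)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # w = C^T (C v), where C is the centered data matrix
        proj = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(len(centered)))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mu, v

def pddp(points, n_clusters):
    """Repeatedly bisect the cluster with the largest scatter."""
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        idx = max(range(len(clusters)), key=lambda i: scatter(clusters[i]))
        target = clusters.pop(idx)
        mu, v = first_principal_direction(target)
        left, right = [], []
        for p in target:
            # Sign of the projection onto the principal direction decides the side
            score = sum((p[j] - mu[j]) * v[j] for j in range(len(mu)))
            (left if score <= 0 else right).append(p)
        if not left or not right:  # cluster cannot be split any further
            clusters.append(target)
            break
        clusters.extend([left, right])
    return clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = pddp(points, 2)
print(sorted(sorted(c) for c in clusters))
```

The same sign test on the projection is what would yield one bit of a binary code when the principal direction is used as a projection direction for hashing; how the paper assembles k bits is not stated in the abstract and is not modeled here.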

A Survey of Advances in Hierarchical Clustering Algorithms and Applications

Munshi, Amr
- International Journal of Computer Science & Network Security, v.22 no.5, pp.17-24, 2022
Hierarchical clustering methods were first proposed more than sixty years ago and are still used in various disciplines for observing relations and for clustering. In 1965, divisive hierarchical methods were proposed in the biological sciences, and they have since been used in disciplines such as anthropology and ecology. More recently, hierarchical methods have been deployed in economic and energy studies. Unlike most clustering algorithms, which require the user to specify the number of clusters, hierarchical clustering is well suited to situations where the number of clusters is unknown. This paper presents an overview of the hierarchical clustering algorithm and discusses the dissimilarity measures that can be used with it. The paper also highlights the various recent disciplines in which hierarchical clustering algorithms are employed.
https://doi.org/10.22937/IJCSNS.2022.22.5.4

A Divisive Clustering for Mixed Feature-Type Symbolic Data (혼합형태 심볼릭 데이터의 군집분석방법)

Kim, Jaejik
- The Korean Journal of Applied Statistics, v.28 no.6, pp.1147-1161, 2015
Nowadays we consider and analyze not only classical data expressed as points in p-dimensional Euclidean space but also new types of data such as signals, functions, images, and shapes. Symbolic data can also be considered one of these new types. Symbolic data can take various formats, such as intervals, histograms, lists, tables, distributions, and models. To date, symbolic data studies have mainly focused on individual formats of symbolic data. This study extends the analysis to datasets containing both histogram and multimodal-valued data, introduces a divisive clustering method for such mixed feature-type symbolic data, and applies it to the analysis of industrial accident data.
https://doi.org/10.5351/KJAS.2015.28.6.1147

An Interactive e-HealthCare Framework Utilizing Online Hierarchical Clustering Method (온라인 계층적 군집화 기법을 활용한 양방향 헬스케어 프레임워크)

Musa, Ibrahim Musa Ishag;Jung, Sukho;Shin, DongMun;Yi, Gyeong Min;Lee, Dong Gyu;Sohn, Gyoyong;Ryu, Keun Ho
- Proceedings of the Korea Information Processing Society Conference, 2009.04a, pp.399-400, 2009
In the era of human-centric applications, people have started to look after their well-being by any possible means. This paper proposes a framework for a real-time on-body sensor health-care system, addresses current issues in such systems, and uses an enhanced online divisive agglomerative clustering algorithm (EODAC). EODAC builds a top-down, tree-like structure of clusters that evolves with streaming data in order to cluster on-body sensor data rationally and give accurate diagnoses remotely, while guaranteeing high performance and scalability. Furthermore, it does not depend on the number of data points.
https://doi.org/10.3745/PKIPS.y2009m04a.399

A Comparative Study on the Agglomerative and Divisive Methods for Hierarchical Document Clustering (계층적 문서 클러스터링을 위한 응집식 기법과 분할식 기법의 비교 연구)

Lee, Jae-Yun;Jeong, Jin-Ah
- Proceedings of the Korean Society for Information Management Conference, 2005.08a, pp.65-70, 2005
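We hypothesized that, in hierarchical document clustering, the relative performance of agglomerative and divisive methods differs depending on the test collection, and that the deciding factor is the depth of classification, that is, the classification level. We expected divisive methods to have a relative advantage for broad classifications, which need only a few splits, and agglomerative methods to have the advantage for fine-grained classifications, which need only a few merges. Accordingly, we compared the performance of bisecting K-means, a divisive method, with that of the complete linkage, average linkage, and Ward agglomerative methods, applying similarity coefficients to test collections with broad and fine-grained classifications, in order to find the clustering method suited to the characteristics of each test collection. The experimental results showed that what affects the relative performance of the agglomerative and divisive methods is not the classification level but the relative variation in cluster size, as measured by the coefficient of variation.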
