• Title/Summary/Keyword: divisive


Double monothetic clustering for histogram-valued data

  • Kim, Jaejik;Billard, L.
    • Communications for Statistical Applications and Methods / v.25 no.3 / pp.263-274 / 2018
  • A common task in the analysis of large datasets is to detect and construct homogeneous groups of objects, typically by some form of clustering. In this study, we present a divisive hierarchical clustering method based on two monothetic characteristics of histogram data. Unlike a classical data point, a histogram carries internal variation as well as location information. However, to find the optimal bipartition, existing divisive monothetic clustering methods for histogram data use only location information as the monothetic characteristic, and therefore cannot distinguish histograms that have the same location but different internal variation. This study proposes a divisive clustering method that considers both the location and the internal variation of histograms. The method makes clustering outcomes easier to interpret because it provides a binary question for each split. The proposed method is verified through a simulation study and applied to a large U.S. house property value dataset.
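The "binary question for each split" idea can be illustrated on ordinary numeric data. The sketch below is not the authors' histogram method: it assumes plain points rather than histograms, and the function name `best_monothetic_split` and the within-cluster sum of squares criterion are illustrative choices. It tries every single variable and cut point and keeps the question "is variable j <= cut?" that gives the smallest total within-cluster sum of squares.

```python
def within_ss(rows):
    """Sum of squared deviations from the column means of `rows`."""
    if not rows:
        return 0.0
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    return sum((r[j] - means[j]) ** 2 for r in rows for j in range(p))


def best_monothetic_split(rows):
    """Return (variable index, cut point, left rows, right rows).

    Each candidate split asks one binary question about one variable:
    "is x[j] <= cut?". The criterion (total within-cluster sum of
    squares) is an assumption chosen for illustration.
    """
    best = None
    p = len(rows[0])
    for j in range(p):
        values = sorted(set(r[j] for r in rows))
        # Candidate cuts are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [r for r in rows if r[j] <= cut]
            right = [r for r in rows if r[j] > cut]
            score = within_ss(left) + within_ss(right)
            if best is None or score < best[0]:
                best = (score, j, cut, left, right)
    if best is None:
        raise ValueError("no split possible: all rows are identical")
    return best[1:]


# Hypothetical toy data: two groups separated along variable 0.
data = [(1.0, 5.0), (1.2, 9.0), (0.9, 7.0), (8.0, 6.0), (8.3, 8.0)]
j, cut, left, right = best_monothetic_split(data)
print(f"split on variable {j} at {cut}: {len(left)} vs {len(right)} points")
```

Applying the same function recursively to `left` and `right` would yield a divisive hierarchy in which every node is described by one question about one variable.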

Heuristic Algorithm for High-Speed Clustering of Neighbor Vehicular Position Coordinate (주변 차량 위치 좌표의 고속 클러스터링을 위한 휴리스틱 알고리즘)

  • Choi, Yoon-Ho;Yoo, Seung-Ho;Seo, Seung-Woo
    • The Journal of Korean Institute of Communications and Information Sciences / v.39C no.4 / pp.343-350 / 2014
  • Divisive hierarchical clustering algorithms recursively decompose and cluster data. In each recursive call, the data in each cluster are selected arbitrarily, which can increase the total clustering time and makes the approach difficult to apply to clustering neighbor vehicular position data for vehicular localization. In this paper, we propose a new heuristic algorithm that shortens the clustering time by eliminating randomness in the data selected when generating the initial divisive clusters.

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

  • Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2008.10a / pp.406-409 / 2008
  • Clustering data streams is important in many applications, such as sensor networks. Existing hierarchical methods follow a semi-fuzzy clustering that yields duplicate clusters. To solve this problem, we propose an extended online divisive agglomerative clustering method for data streams. It builds a tree-like, top-down hierarchy of clusters that evolves with the data stream, using a geometric time frame for snapshots. It enhances Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy that avoids duplicate clusters. Its main feature is that update time and memory use are independent of the number of examples in the data stream. It can be used for clustering sensor data, for network monitoring, and for web click streams.


A Divisive Clustering for Mixed Feature-Type Symbolic Data (혼합형태 심볼릭 데이터의 군집분석방법)

  • Kim, Jaejik
    • The Korean Journal of Applied Statistics / v.28 no.6 / pp.1147-1161 / 2015
  • Nowadays we analyze not only classical data expressed as points in p-dimensional Euclidean space but also new types of data such as signals, functions, images, and shapes. Symbolic data can be considered one of these new types. Symbolic data can take various formats, such as intervals, histograms, lists, tables, distributions, and models. To date, symbolic data studies have mainly focused on individual formats. This study extends the analysis to datasets containing both histogram-valued and multimodal-valued data, introduces a divisive clustering method for such mixed feature-type symbolic data, and applies it to industrial accident data.

Parts grouping by a hierarchical divisive algorithm and machine cell formation (계층 분리 알고리즘에 의한 부품 그룹핑 및 셀 구성)

  • Lee, Choon-Shik;Hwang, Hark
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 1991.10a / pp.589-594 / 1991
  • Group Technology (GT) is a technique for identifying and bringing together related or similar components in a production process in order to take advantage of their similarities by making use of, for example, the inherent economies of flow production methods. The process of identifying, from a large variety and number of components, the part families that require similar manufacturing operations, and of forming the associated groups of machines, is referred to as 'machine-component grouping'. The first part of this paper describes a hierarchical divisive algorithm based on graph theory for finding the natural part families. The objective is to form components into part families such that the degree of inter-relation is high among components within the same part family and low between components of different part families. The second part of this paper focuses on establishing cell design procedures. The aim is to create cells in which the most expensive and important machines, called key machines, have reasonably high utilization, and in which machines are allocated so as to minimize the intercell movement of machine loads. To fulfil these objectives, a 0-1 integer programming model is developed, together with solution procedures. Finally, the feasibility of the proposed method is tested on several problems from the literature, and the results are briefly presented.


A Hashing Method Using PCA-based Clustering (PCA 기반 군집화를 이용한 해슁 기법)

  • Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.3 no.6 / pp.215-218 / 2014
  • In hashing-based methods for approximate nearest neighbor (ANN) search, data points are mapped to k-bit binary codes and nearest neighbors are searched in the binary embedding space. In this paper, we present a hashing method that uses a PCA-based clustering method, Principal Direction Divisive Partitioning (PDDP). PDDP is a clustering method that repeatedly partitions the cluster with the largest variance into two clusters by using the first principal direction. The proposed hashing method uses the first principal direction as the projection direction for binary coding. Experimental results demonstrate that the proposed method is competitive with other hashing methods.
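The PDDP loop described in this abstract can be sketched in a few lines. This is a minimal illustration and not the paper's hashing method: it assumes that "largest variance" means the largest sum of squared distances to the centroid, it finds the first principal direction by power iteration, and the function names (`pddp`, `principal_direction`) are hypothetical.

```python
def mean_vector(points):
    n, d = len(points), len(points[0])
    return [sum(p[k] for p in points) / n for k in range(d)]


def total_variance(points):
    """Sum of squared distances to the centroid (cluster scatter)."""
    m = mean_vector(points)
    return sum((p[k] - m[k]) ** 2 for p in points for k in range(len(m)))


def principal_direction(points, iters=200):
    """Centroid and first principal direction via power iteration."""
    m = mean_vector(points)
    d = len(m)
    centered = [[p[k] - m[k] for k in range(d)] for p in points]
    v = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        # w = X^T (X v): multiply by the scatter matrix without forming it.
        proj = [sum(row[k] * v[k] for k in range(d)) for row in centered]
        w = [sum(proj[i] * centered[i][k] for i in range(len(centered)))
             for k in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # all points coincide; keep the current direction
        v = [x / norm for x in w]
    return m, v


def pddp(points, n_clusters):
    """Split the highest-scatter cluster until n_clusters are reached."""
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        clusters.sort(key=total_variance)
        target = clusters.pop()  # cluster with the largest scatter
        m, v = principal_direction(target)
        # The sign of the projection onto v decides the side of the split.
        side = [sum((p[k] - m[k]) * v[k] for k in range(len(m))) for p in target]
        left = [p for p, s in zip(target, side) if s <= 0]
        right = [p for p, s in zip(target, side) if s > 0]
        if not left or not right:
            clusters.append(target)
            break  # the cluster cannot be split further
        clusters += [left, right]
    return clusters


# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.4),
          (10.0, 10.0), (10.3, 9.8), (9.9, 10.4)]
clusters = pddp(points, 2)
print(sorted(len(c) for c in clusters))
```

Power iteration keeps the sketch free of external dependencies; a library SVD would be the more usual choice for real data.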

An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

  • Lee, Kwangjin
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.387-397 / 2003
  • Most studies that need a variable-clustering step use either exploratory factor analysis or a divisive hierarchical variable-clustering method based on a correlation matrix. Some researchers apply an object-clustering method to a distance matrix transformed from a correlation matrix, though this approach is known to be improper. In this paper, an agglomerative hierarchical variable-clustering method based on the correlation matrix itself is proposed. It is derived from a geometric concept using variate-spaces and a characterizing variate.

Nematode Fauna of High Altitude Avian Hosts in Garhwal Himalayan Ecosystems I. Eustrongylides spinispiculum n. sp. and Revised Key to the Species of Genus Eustrongylides Jagerskiold (1909)

  • Rautela, Anand S.;Malhotra, Sandeep K.
    • Parasites, Hosts and Diseases / v.22 no.2 / pp.242-247 / 1984
  • Analysis of variance has been applied as a new tool for precise substantiation of tabometric differences between Eustrongylides spinispiculum n. sp. and closely related species, as indicated by the polythetic divisive classificatory system. A revised key to the species of genus Eustrongylides Jagerskiold (1909) is presented.


A Comparative Study on the Agglomerative and Divisive Methods for Hierarchical Document Clustering (계층적 문서 클러스터링을 위한 응집식 기법과 분할식 기법의 비교 연구)

  • Lee, Jae-Yun;Jeong, Jin-Ah
    • Proceedings of the Korean Society for Information Management Conference / 2005.08a / pp.65-70 / 2005
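  • We hypothesized that, in hierarchical document clustering, the relative performance of agglomerative and divisive methods differs by test collection, and that the deciding factor is the depth of classification, that is, the classification level. The reasoning was that divisive methods should have the advantage for broad categories, which require only a few splits, whereas agglomerative methods should have the advantage for fine-grained categories, which require only a few merges. Accordingly, we compared the bisecting K-means method, a divisive clustering technique, with the complete-linkage, average-linkage, and Ward agglomerative methods, applying similarity coefficients to test collections with broad and with fine-grained categories, in order to identify the clustering method best suited to the characteristics of a test collection. The experiments showed that what determines which of the two approaches performs better is not the classification level but the relative variation in cluster sizes, as measured by the coefficient of variation.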
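The abstract above names the coefficient of variation of cluster sizes as the factor that mattered. As a small illustration, the sketch below computes it as the standard deviation of the sizes divided by their mean. Whether the authors used the population or the sample standard deviation is not stated; the population form is assumed here, and the function name and the example sizes are hypothetical.

```python
import statistics


def size_cv(cluster_sizes):
    """Coefficient of variation of cluster sizes: std / mean.

    Uses the population standard deviation (an assumption; the sample
    form, statistics.stdev, would give slightly larger values).
    """
    mean = statistics.fmean(cluster_sizes)
    return statistics.pstdev(cluster_sizes) / mean


balanced = size_cv([50, 50, 50])   # equal-sized clusters
skewed = size_cv([120, 20, 10])    # one dominant cluster
print(balanced, skewed)
```

A value of zero means all clusters have the same size; larger values mean the sizes are more uneven.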


A study on nonlinear transform layers in neural networks for image compression (정지영상 압축을 위한 인공신경망 내 비선형 변환 계층 분석)

  • Lee, Jooyoung;Cho, Seunghyun;Kim, Hui Yong;Choi, Jin Soo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.06a / pp.267-269 / 2018
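  • With the spread and adoption of artificial neural networks, their range of application is expanding and they are achieving breakthrough performance gains in many fields. Technology development in image compression is proceeding along two lines: neural network techniques that improve the individual component technologies within existing codec structures, and neural-network-based techniques that use end-to-end learning rather than an existing codec structure. In this paper, we analyze the effect of the GDN (generalized divisive normalization) layer, one of the nonlinear transform layers in end-to-end learned neural network techniques, on image compression.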
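For readers unfamiliar with GDN, the sketch below shows the commonly used form of the transform, y[i] = x[i] / sqrt(beta[i] + sum_j gamma[i][j] * x[j]^2), applied to one vector of channel values. It is an illustration only: the abstract does not state which GDN variant the authors analyze, and the parameter values here are made up, whereas in a trained network `beta` and `gamma` are learned.

```python
import math


def gdn(x, beta, gamma):
    """Generalized divisive normalization of one channel vector.

    y[i] = x[i] / sqrt(beta[i] + sum_j gamma[i][j] * x[j] ** 2)
    Each channel is divided by a measure of the pooled energy of all
    channels, which is where the "divisive" in the name comes from.
    """
    n = len(x)
    return [
        x[i] / math.sqrt(beta[i] + sum(gamma[i][j] * x[j] ** 2 for j in range(n)))
        for i in range(n)
    ]


# Hypothetical two-channel example with hand-picked parameters.
x = [3.0, 4.0]
beta = [1.0, 1.0]
gamma = [[0.1, 0.1],
         [0.1, 0.1]]
y = gdn(x, beta, gamma)
print(y)
```

With `gamma` set to zero and `beta` set to one, the function returns its input unchanged, which makes it easy to see that the normalization comes entirely from the cross-channel term.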
