• Title/Summary/Keyword: K-medoid

Optimal k-search and Its Application in k-medoid Clustering Algorithm based on Genetic Algorithm (유전자 알고리즘에 기반한 K-medoid 클러스터링 알고리즘에서의 최적의 k-탐색과 적용)

  • Ahn Sun-Young;Yoon Hye-Sung;Lee Sang-Ho
    • Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.55-57 / 2006
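  • The k-medoid clustering algorithm runs with a fixed number of clusters (k), so correct analysis is difficult without prior knowledge of the data. The validity of the results also has to be checked by repeating the experiment many times with different numbers of clusters, so the time cost grows as the data set grows. This paper proposes a new method that finds an appropriate number of clusters k, one of the most difficult problems in k-medoid clustering analysis, using the betweenness centrality value from social network analysis. The method is applied to real microarray data, and k-medoid clustering based on a genetic algorithm is performed to obtain more accurate clustering results.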

Changes in Korea Steel Industry and Formation Process of Technology-knowledge network (한국 철강산업 변화와 기술지식 네트워크 형성 과정)

  • Park, Sohyun
    • Journal of the Economic Geographical Society of Korea / v.19 no.3 / pp.474-490 / 2016
  • This paper investigates how the Korean steel industry has experienced technological diversification, organizational flexibility, and geographical dispersion, and analyzes how its technology-knowledge network has formed. The network is constructed from mutual patent data, and K-medoid clustering and brokerage analysis are applied to it. The results indicate that the actors in the network have diversified and that links between actors in the same cluster have grown stronger. Network formation reflects affiliation, competition, and cooperation in the industry, and brokerage roles are detected for conglomerates, research institutes, and small and medium-sized companies.

Hydrological homogeneous region delineation for bivariate frequency analysis of extreme rainfalls in Korea (다변량 L-moment를 이용한 이변량 강우빈도해석에서 수문학적 동질지역 선정)

  • Shin, Ju-Young;Jeong, Changsam;Joo, Kyungwon;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.51 no.1 / pp.49-60 / 2018
  • Multivariate regional frequency analysis has advantages such as the adoption of regional parameters and the consideration of the correlation structure of the data, and it can provide broader and more detailed information on hydrological variables. It has not yet been used to model hydrological variables in South Korea, so its applicability to such modeling needs to be investigated. The current study investigated the applicability of homogeneous region delineation, and the characteristics of the delineated regions, in bivariate regional frequency analysis of annual maximum rainfall depth-duration data. The K-medoid method was employed for clustering. Discordancy and heterogeneity measures were used to assess the appropriateness of the delineation results. According to the clustering analysis, the stations could be grouped into five regions. In three of the five regions, all stations yielded discordancy measures within the acceptable threshold. Stations with short record lengths produced large discordancy measures. All grouped regions were identified as homogeneous based on the heterogeneity measure estimates. Strong cross-correlations were observed among stations in the same region.

Medoid Determination in Deterministic Annealing-based Pairwise Clustering

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.11 no.3 / pp.178-183 / 2011
  • The deterministic annealing-based clustering algorithm is an EM-based algorithm that behaves like simulated annealing yet is less sensitive to the initialization of parameters. Pairwise clustering is a clustering technique that works from inter-entity distance information without requiring detailed attribute information. The pairwise deterministic annealing-based clustering algorithm alternates between estimating the mean fields and updating the membership degrees of data objects to clusters until a termination condition holds. Lacking attribute value information, pairwise clustering algorithms do not explicitly determine the centroids or medoids of clusters, either during the clustering process or at its end. This paper proposes a method that identifies medoids as the centers of the formed clusters for the pairwise deterministic annealing-based clustering algorithm. Experimental results show that the proposed method locates meaningful medoids.
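
To illustrate the general idea of picking medoids when only pairwise distances and membership degrees are available, here is a minimal Python sketch. It is not the method proposed in the paper: it assumes a simple rule in which the medoid of a cluster is the object with the smallest membership-weighted total distance to all objects. The function name `weighted_medoids` and the toy data are hypothetical.

```python
def weighted_medoids(dist, membership):
    """Pick one medoid index per cluster.

    dist: n x n list of pairwise distances (no attribute vectors needed).
    membership: n x k list of membership degrees of objects to clusters.
    Assumed rule (illustrative only): the medoid of cluster c is the
    object i minimizing sum_j membership[j][c] * dist[i][j].
    """
    n = len(dist)
    k = len(membership[0])
    medoids = []
    for c in range(k):
        # Membership-weighted total distance from each candidate i to all objects j.
        costs = [
            sum(membership[j][c] * dist[i][j] for j in range(n))
            for i in range(n)
        ]
        # Index of the cheapest candidate (first one wins on ties).
        medoids.append(min(range(n), key=costs.__getitem__))
    return medoids


# Toy data: six objects on a line, used only to build the distance matrix.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]
# Hard memberships: first three objects in cluster 0, last three in cluster 1.
membership = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3

medoids = weighted_medoids(dist, membership)
print(medoids)
```

With soft membership degrees, as produced by an annealing-based procedure, the same rule applies unchanged because each object contributes in proportion to its degree of membership.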

Two-Phase Algorithm for Determining the Number and the Locations of RBF Centers (RBF 네트웍의 중심 개수와 위치의 통합 결정을 위한 Two-Phase 알고리즘)

  • 이대원;이재욱
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.827-834 / 2003
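  • Previous studies on determining the centers of RBF networks developed algorithms that determine only the center locations, assuming that the number of hidden-layer nodes (that is, the number of centers) is already fixed. However, the performance and computation speed of an RBF network are also sensitive to the number of centers, so the locations and the number of centers need to be considered together. This paper proposes a Two-Phase algorithm that considers both the locations and the number of centers when determining the centers of an RBF network. In the first phase, the minimum number of centers and their locations are determined using the bisection method and an adjusted k-medoid clustering technique. In the second phase, the weights of the RBF network are determined and the network design is completed. When applied to various numerical examples, the proposed algorithm achieved more accurate prediction performance with fewer centers than existing center-determination algorithms.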

Partial Dimensional Clustering based on Projection Filtering in High Dimensional Data Space (대용량의 고차원 데이터 공간에서 프로젝션 필터링 기반의 부분차원 클러스터링 기법)

  • 이혜명;정종진
    • The Journal of Society for e-Business Studies / v.8 no.4 / pp.69-88 / 2003
  • In high-dimensional data, the performance of most clustering algorithms degrades rapidly because of sparsity and noise. Recently, partial dimensional clustering algorithms, which perform well in this setting, have been studied. These algorithms select the dimensions closely related to clustering and discard those that are not directly related to it. However, the traditional algorithms have some problems. First, they employ grid-based techniques, but the large number of grids worsens their computational time and memory usage. Second, they explore dimensions related to clustering using k-medoid, but it is very difficult to determine good-quality k-medoids in large volumes of high-dimensional data. In this paper, we propose an efficient partial dimensional clustering algorithm called CLIP. CLIP explores dense regions for clusters on a certain dimension. The algorithm then probes dense regions on the next dimension, dependent on the dense regions of the explored dimension, using incremental projection. CLIP repeats this probing over all dimensions. Clustering by incremental projection can prune the search space substantially and reduce the computational time considerably. We evaluate the performance (efficiency, effectiveness, accuracy, etc.) of the proposed algorithm against other algorithms using common synthetic data.