• Title/Summary/Keyword: K-means clustering (K-means 군집화)


Identifying the Optimal Number of Homogeneous Regions for Regional Frequency Analysis Using Self-Organizing Map (자기조직화지도를 활용한 동일강수지역 최적군집수 분석)

  • Kim, Hyun Uk;Sohn, Chul;Han, Sang-Ok
    • Spatial Information Research / v.20 no.6 / pp.13-21 / 2012
  • In this study, homogeneous regions for regional frequency analysis were identified using rainfall data from 61 observation points in Korea. The data were gathered from 1980 to 2010. A self-organizing map and K-means clustering based on the Davies-Bouldin index were used to form clusters with similar rainfall patterns and to decide the optimal number of homogeneous regions. The analysis showed that the 61 observation points can be optimally grouped into 6 geographical clusters. Finally, the 61 observation points, grouped into 6 clusters, were mapped regionally using the Thiessen polygon method.
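
The idea of choosing the number of clusters by the Davies-Bouldin index can be sketched as follows. This is a minimal illustration, not the authors' procedure: it omits the self-organizing map stage, uses made-up 2-D points rather than rainfall data, and the helper names `kmeans` and `davies_bouldin` are hypothetical. A lower index means more compact, better separated clusters, so the candidate k with the smallest value is selected.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def davies_bouldin(points, labels, centroids):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance.

    Assumes at least two non-empty clusters with distinct centroids.
    """
    used = sorted(set(labels))
    scatter = {}
    for j in used:
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter[j] = sum(math.dist(p, centroids[j]) for p in members) / len(members)
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in used if j != i
        )
        for i in used
    ]
    return sum(worst) / len(worst)

# Toy data (an assumption for illustration): three well-separated groups.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]
scores = {}
for k in range(2, 6):
    labels, centroids = kmeans(points, k)
    scores[k] = davies_bouldin(points, labels, centroids)
best_k = min(scores, key=scores.get)
print(best_k, scores)
```

On real data, k-means is usually restarted from several random initializations; the deterministic start here only keeps the sketch reproducible.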

A Study of Similarity Measure Algorithms for Recomendation System about the PET Food (반려동물 사료 추천시스템을 위한 유사성 측정 알고리즘에 대한 연구)

  • Kim, Sam-Taek
    • Journal of the Korea Convergence Society / v.10 no.11 / pp.159-164 / 2019
  • Recent developments in ICT have increased interest in the care and health of pets such as dogs and cats. In this paper, cluster analysis is performed on the component data of pet food for use in various fields of the pet industry. For the cluster analysis, similarity is measured by analyzing the correlation between the components of 300 dog and cat food products on the market. Hierarchical, K-Means, Partitioning Around Medoids (PAM), density-based, and Mean-Shift clustering techniques are applied and analyzed. We also propose a personalized recommendation system for pets. The results can be used for personalized services such as a pet food recommendation system.

Magnifying Block Diagonal Structure for Spectral Clustering (스펙트럼 군집화에서 블록 대각 형태의 유사도 행렬 구성)

  • Heo, Gyeong-Yong;Kim, Kwang-Baek;Woo, Young-Woon
    • Journal of Korea Multimedia Society / v.11 no.9 / pp.1302-1309 / 2008
  • Traditional clustering methods, like k-means or fuzzy clustering, are prototype-based methods that are applicable only to convex clusters. Spectral clustering, on the other hand, tries to find clusters using only local similarity information. Its ability to handle concave clusters has made it popular in recent years, together with the support vector machine (SVM), a kernel-based classification method. However, as in SVM, the kernel width plays an important role and has a great impact on the result. Although several methods have been proposed to choose it automatically, it is still determined heuristically. In this paper, we propose an adaptive method that decides the kernel width based on a distance histogram. The proposed method is motivated by the fact that the affinity matrix should form a block diagonal matrix to generate the best result. We use the traditional Euclidean distance together with the random walk distance, which makes it possible to form a more apparent block diagonal affinity matrix. Experimental results show that the proposed method generates a more clearly block-structured affinity matrix than the existing method does.
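
To make the role of the kernel width concrete, the sketch below builds the standard Gaussian affinity matrix, exp(-d^2 / (2 sigma^2)), for two toy groups of points at two fixed widths. This is only the common textbook construction with a hand-picked `sigma`. It does not implement the adaptive, histogram-based width selection or the random walk distance proposed in the paper, and the function name `gaussian_affinity` is hypothetical.

```python
import math

def gaussian_affinity(points, sigma):
    """Affinity A[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(points)
    return [
        [
            math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Toy data (an assumption): indices 0-2 form one group, 3-5 the other.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

tight = gaussian_affinity(points, sigma=1.0)   # narrow kernel
loose = gaussian_affinity(points, sigma=10.0)  # wide kernel

for row in tight:
    print(" ".join(f"{a:.2f}" for a in row))
```

With the narrow kernel, entries linking the two groups are practically zero, so the matrix is close to block diagonal. With the wide kernel the cross-group entries become comparable to the within-group ones and the block structure blurs.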


Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
  • We perform clustering analyses of yeast cell-cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients, applying an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, hierarchical clustering with the Ward method, and fuzzy clustering on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation with GO analysis is also included.
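
As an illustration of one of the validity measures named above, the following sketch computes silhouette values from a given labeling. It uses the usual definition s = (b - a) / max(a, b), where a is the mean distance to the other members of a point's own cluster and b is the smallest mean distance to any other cluster. The toy points and the function name `silhouette_values` are assumptions for illustration, not data or code from the paper.

```python
import math

def silhouette_values(points, labels):
    """Per-point silhouette values; requires at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in idxs) / len(idxs)
            for other, idxs in clusters.items() if other != lab
        )
        values.append((b - a) / max(a, b))
    return values

# Toy data (an assumption): two tight pairs far apart on a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
good = silhouette_values(points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_values(points, [0, 1, 0, 1])   # deliberately mixed grouping

mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
print(mean_good, mean_bad)
```

Values near 1 indicate points that sit well inside their cluster, while negative values indicate points that are closer to another cluster, so the mean silhouette can be compared across clustering methods.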

Selection of An Initial Training Set for Active Learning Using Cluster-Based Sampling (능동적 학습을 위한 군집기반 초기훈련집합 선정)

  • 강재호;류광렬;권혁철
    • Journal of KIISE:Software and Applications / v.31 no.7 / pp.859-868 / 2004
  • We propose a method of selecting initial training examples for active learning so that the learner reaches high accuracy faster and with fewer further queries. Our method is based on the assumption that an active learner can reach higher performance when given an initial training set consisting of diverse and typical examples rather than similar and special ones. To obtain a good initial training set, we first cluster the examples using the k-means clustering algorithm to find groups of similar examples. Then a representative example, the example closest to the cluster's centroid, is selected from each cluster. After these representative examples are labeled by querying the user for their categories, they can be used as initial training examples. We also suggest a method of using the centroids themselves as initial training examples by labeling them with the categories of the corresponding representative examples. Experiments with various text data sets show that an active learner starting from the initial training set selected by our method reaches higher accuracy faster than one starting from a randomly generated initial training set.
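
The representative-selection step described above can be sketched as follows. The sketch assumes that examples are already numeric feature vectors and that cluster labels have been produced by k-means beforehand. The function name `representatives` and the toy data are hypothetical, and the paper's text-specific preprocessing is not shown.

```python
import math

def representatives(points, labels):
    """Return {cluster label: index of the example nearest that cluster's centroid}."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    dims = len(points[0])
    reps = {}
    for lab, idxs in clusters.items():
        # Centroid = coordinate-wise mean of the cluster's members.
        centroid = tuple(
            sum(points[i][d] for i in idxs) / len(idxs) for d in range(dims)
        )
        reps[lab] = min(idxs, key=lambda i: math.dist(points[i], centroid))
    return reps

# Toy data (an assumption): two clusters with labels assumed to come from k-means.
points = [(0, 0), (2, 0), (1, 0.1), (10, 10), (10, 12), (10.2, 11)]
labels = [0, 0, 0, 1, 1, 1]

reps = representatives(points, labels)
print(reps)  # indices of the examples that would be sent to the user for labeling
```

Only the selected examples need to be labeled by the user, which is what keeps the initial labeling cost at one query per cluster.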

A Image Contrast Enhancement Using Clustering of Image Histogram (히스토그램 군집화를 이용한 영상 대비 향상)

  • Hong, Seok-Keun;Park, Joon-Woo;Kang, Byeong-Jo;Choi, Yu-Na;Cho, Seok-Je
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.379-380 / 2009
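  • Existing contrast enhancement techniques such as histogram stretching and histogram equalization, along with the many methods based on histogram equalization, do not produce satisfactory results for low-contrast images in which a small number of pixels are spread widely. This paper therefore proposes a new image contrast enhancement technique that uses clustering. The number of histogram clusters is obtained by analyzing the histogram of the original image. The histogram components are clustered using the K-means algorithm. Histogram stretching and histogram equalization are then applied selectively by comparing the range of each histogram cluster with the cluster's share of pixels. Experimental results confirm that the proposed method is more effective than existing contrast enhancement techniques.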

Analysis of Apartment Power Consumption and Forecast of Power Consumption Based on Deep Learning (공동주택 전력 소비 데이터 분석 및 딥러닝을 사용한 전력 소비 예측)

  • Yoo, Namjo;Lee, Eunae;Chung, Beom Jin;Kim, Dong Sik
    • Journal of IKEEE / v.23 no.4 / pp.1373-1380 / 2019
  • To increase energy efficiency, the advanced metering infrastructure (AMI) of smart grid technology has recently been under active development. An essential part of AMI is analyzing power consumption and forecasting consumption patterns. In this paper, we analyze power consumption and summarize the data errors. Monthly power consumption patterns are also analyzed using the k-means clustering algorithm. Forecasting the consumption pattern of each household is difficult. Therefore, we first classify the data into 100 clusters and then use a deep neural network to predict the next day's average as the daily average of the clusters. Using AMI data collected in practice, we analyzed the data errors and successfully performed power forecasting based on a clustering technique.

Selection of Optimal Variables for Clustering of Seoul using Genetic Algorithm (유전자 알고리즘을 이용한 서울시 군집화 최적 변수 선정)

  • Kim, Hyung Jin;Jung, Jae Hoon;Lee, Jung Bin;Kim, Sang Min;Heo, Joon
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.175-181 / 2014
  • The Korean government proposed a new initiative, 'Government 3.0', under which the administration will open its datasets to the public before they are requested. The city of Seoul is the front runner in the disclosure of government data. If we know which attributes are the governing factors for a given segmentation, the outcomes can be applied to real-world problems of marketing and business strategy and to administrative decision making. However, for the city of Seoul, selecting optimal variables from an open dataset of up to several thousand attributes would require an enormous amount of computation time, because it may require combinatorial optimization while maximizing dissimilarity measures between clusters. In this study, we acquired a dataset of 718 attributes from Statistics Korea and conducted an analysis to select the most suitable variables, those that differentiate Gangnam from other districts, using a genetic algorithm and Dunn's index. We also used the Microsoft Azure cloud computing system to reduce processing time. As a result, 28 optimal variables were selected, and the validation showed that those 28 variables effectively separate Gangnam from other districts using Ward's minimum variance method and the K-means algorithm.

Analysis of Ship Investment Patterns Using Clustering between Greece and Korea (군집화 분석을 활용한 선박투자패턴 분석: 그리스와 한국 사례 중심으로)

  • Lim, Sangseop;Kim, Seok-Hun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.707-708 / 2021
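  • Ships are the most important assets in the shipping market. Ship investment requires large-scale capital raising, and it is important to avoid investing at the market peak and to reduce financing costs through analysis of market and economic conditions; these decisions determine whether an investment succeeds or fails. This paper uses K-means cluster analysis to classify the ship investment behavior of Greek and Korean shipowners. By identifying the main factors in ship investment, the analysis aims to help establish benchmark investment strategies for ship investment at the company level and to derive implications for the strategies needed for ship investment at the policy level.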


Adaptive Edge Detection Using Histogram Equalization and Clustering (히스토그램 평활화와 군집화 전처리를 통한 적응적 경계선 추출 방법)

  • Choi, Jinjung;Lee, Jeonghyun;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2017.11a / pp.84-87 / 2017
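  • The smaller the brightness difference between neighboring pixels, the more likely they are to form part of the same edge. Therefore, applying an edge detector that takes the brightness of neighboring pixels into account allows more accurate edge extraction. However, the existing algorithm, which uses a single histogram equalization together with k-means clustering, found unnecessary edges that were not in the original image. These result from grouping errors: image distortion caused by equalization, pixels with a large brightness difference falling into the same group, or pixels with a small brightness difference falling into different groups. This paper proposes an algorithm that applies several histogram equalization methods to a single image to obtain different brightness distributions and determines edges adaptively. It removes the unnecessary edges that appear with the existing algorithm and improves the performance of the basic edge detector.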
