• Title/Abstract/Keywords: k-means Algorithm

1,367 search results; processing time 0.033 s

A K-means-like Algorithm for K-medoids Clustering

  • 이종석;박해상;전치혁
    • 한국경영과학회:학술대회논문집 / 한국경영과학회 2005년도 추계학술대회 및 정기총회 / pp.51-54 / 2005
  • Clustering analysis is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. In this paper we propose a new algorithm for K-medoids clustering which runs like the K-means algorithm. The new algorithm calculates the distance matrix once and uses it to find new medoids at every iterative step. We evaluate the proposed method using real and synthetic data and compare it with the results of other algorithms. The proposed algorithm takes less computation time and performs better than the others.

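The abstract above describes a K-medoids procedure that computes the distance matrix once and reuses it for every medoid update. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean metric, the caller-supplied initial medoids, and the stopping rule are all hypothetical choices.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances, computed a single time up front."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def assign(d, medoids):
    """Label every point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def k_medoids(points, initial_medoids, max_iter=100):
    """K-means-style alternation over point indices (assumed scheme).

    initial_medoids holds indices into points; how to pick them is left
    to the caller, since the abstract does not say.
    """
    d = distance_matrix(points)  # reused in every iteration below
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(d, medoids)
        new_medoids = []
        for c in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # can happen with duplicate points; keep old medoid
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
        if new_medoids == medoids:  # medoids stopped moving
            break
        medoids = new_medoids
    return medoids, assign(d, medoids)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(points, initial_medoids=[1, 4])
print(medoids, labels)  # [0, 3] [0, 0, 0, 1, 1, 1]
```

Because medoids are always data points, the update step only needs lookups into the precomputed matrix, which is where the time saving claimed in the abstract would come from.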

맵리듀스를 이용한 다중 중심점 집합 기반의 효율적인 클러스터링 방법 (An Efficient Clustering Method based on Multi Centroid Set using MapReduce)

  • 강성민;이석주;민준기
    • 정보과학회 컴퓨팅의 실제 논문지 / Vol. 21, No. 7 / pp.494-499 / 2015
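  • As data volumes grow, analyzing large-scale data to understand its characteristics has become very important. This paper proposes MCSK-Means (Multi Centroid Set k-Means), an effective clustering technique based on k-Means clustering that uses MapReduce, a distributed parallel processing framework. The k-Means algorithm has the drawback that the accuracy of the clustering result depends heavily on the positions of the k randomly chosen initial centroids. To address this problem, the proposed MCSK-Means algorithm uses m centroid sets, each consisting of k centroids, to reduce the dependence on randomly generated initial centroids. In addition, a hierarchical clustering algorithm is applied directly to the centroids belonging to the m centroid sets after the clustering step, producing k cluster centroids. The MCSK-Means algorithm is developed on the MapReduce framework so that large-scale data can be processed efficiently.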
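The two stages named in this abstract (m independently seeded centroid sets, then hierarchical clustering over the pooled centroids) can be sketched in single-process Python as below. This is only a rough illustration: it does not use MapReduce, and the function names, the fixed iteration count, the random seeding, and the centroid-linkage merge rule are assumptions, since the abstract does not specify them.

```python
import math
import random


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def kmeans(points, centroids, iters=20):
    """Plain k-means from the given initial centroids (fixed iteration count)."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def merge_to_k(centroids, k):
    """Agglomerative step: fuse the two closest groups until k remain.

    Centroid linkage is an assumed choice for this sketch.
    """
    groups = [[c] for c in centroids]
    while len(groups) > k:
        pairs = [(i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))]
        i, j = min(pairs, key=lambda ij: math.dist(mean(groups[ij[0]]), mean(groups[ij[1]])))
        groups[i].extend(groups.pop(j))  # j > i, so index i stays valid
    return [mean(g) for g in groups]


def mcsk_means_sketch(points, k, m, seed=0):
    """Run k-means from m random centroid sets, then reduce m*k centroids to k."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(m):
        pooled.extend(kmeans(points, rng.sample(points, k)))
    return merge_to_k(pooled, k)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(mcsk_means_sketch(points, k=2, m=3)))
```

In a MapReduce setting, the m k-means runs are independent of one another and could plausibly be distributed, while the final merge operates on only m*k centroids and is cheap by comparison.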

A Comparison of the Rudin-Osher-Fatemi Total Variation model and the Nonlocal Means Algorithm

  • ;최흥국
    • 한국멀티미디어학회:학술대회논문집 / 한국멀티미디어학회 2012년도 춘계학술발표대회논문집 / pp.6-9 / 2012
  • In this study, we compare two image denoising methods, the Rudin-Osher-Fatemi total variation (ROF TV) model and the nonlocal means (NLM) algorithm, on medical images. To evaluate the methods, we used two well-known metrics. The methods are tested with one CT image, one X-ray image, and three MRI images. Experimental results show that the NLM algorithm can give better results than the ROF TV model, but its computational complexity is high.


Fuzzy c-Means Clustering Algorithm with Pseudo Mahalanobis Distances

  • ICHIHASHI, Hidetomo;OHUE, Masayuki;MIYOSHI, Tetsuya
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 1998년도 The Third Asian Fuzzy Systems Symposium / pp.148-152 / 1998
  • Gustafson and Kessel proposed a modified fuzzy c-Means algorithm based on the Mahalanobis distance. Though the algorithm appears more natural through the use of a fuzzy covariance matrix, it needs to calculate determinants and inverses of the c fuzzy scatter matrices. This paper proposes a fuzzy clustering algorithm using a pseudo Mahalanobis distance, which is easier to use and more flexible than Gustafson and Kessel's fuzzy c-Means.


군집기반 열간조압연설비 상태모니터링과 진단 (Clustering-based Monitoring and Fault detection in Hot Strip Roughing Mill)

  • 서명교;윤원영
    • 품질경영학회지 / Vol. 45, No. 1 / pp.25-38 / 2017
  • Purpose: A hot strip rolling mill consists of many mechanical and electrical units. During condition monitoring and diagnosis, various units can fail for unknown reasons. In this study, we propose an effective method for early detection of units with abnormal status in order to minimize system downtime. Methods: The early warning problem with various units is defined. The K-means and PAM algorithms with Euclidean and Manhattan distances were applied to detect abnormal status. In addition, the performance of the proposed algorithm is investigated through field data analysis. Results: PAM with Manhattan distance (PAM_ManD) showed better results than the K-means algorithm with Euclidean distance (K-means_ED). In addition, multivariate field data analysis indicated that the system reliability of a hot strip rolling mill can be increased by detecting abnormal status early. Conclusion: In this paper, a clustering-based monitoring and fault detection algorithm using Manhattan distance is proposed. Experiments are performed to study the benefit of PAM with Manhattan distance over K-means with Euclidean distance.

Fuzzy modeling using HPC-MEANS algorithm and genetic algorithm

  • Ryu, Kye-Won;Lee, Won-Gyu;Kim, Seong-Hwan;Noh, Heung-Sik;Park, Mignon
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 1994년도 Proceedings of the Korea Automatic Control Conference, 9th (KACC); Taejeon, Korea; 17-20 Oct. 1994 / pp.113-116 / 1994
  • In this paper, we suggest a new fuzzy modeling algorithm, which can be easily implemented, by combining the HPC-MEANS algorithm and a genetic algorithm. HPC-MEANS is used to cluster the sample data in input-output space with hyperplanes and to perform rough structure identification, and the genetic algorithm is used to tune the premise and consequent parameters. To validate the suggested method, we model a system with I/O data from a known system and then compare the two systems.


개선된 k-means 알고리즘을 적용한 사용자 특성 선호도 추천 시스템 (User's Individuality Preference Recommendation System using Improved k-means Algorithm)

  • 안찬식;오상엽
    • 한국컴퓨터정보학회논문지 / Vol. 15, No. 8 / pp.141-148 / 2010
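  • Service systems on mobile devices that consider the user's context and reflect the user's tastes or characteristics to find or recommend information recommend only conceptual information, and only to a limited extent. In addition, because they do not provide information preferences according to user characteristics, accurate information recommendation is difficult. This paper therefore proposes a preference recommendation system based on user characteristics that applies an improved k-means algorithm and can recommend accurate context information by taking those preferences into account. In this study, preferences according to user characteristics are obtained using correlation coefficients, and the user's characteristic preferences are recommended using the improved k-means algorithm. By providing information preferences according to user characteristics, where earlier systems offered only information of limited concepts, the system recommends accurate information and resolves the drawback of limited information recommendation. The performance experiment measured effectiveness, represented by precision and recall, in comparison with existing service systems; the results showed a precision of 85% and a recall of 68%.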

Fast Super-Resolution Algorithm Based on Dictionary Size Reduction Using k-Means Clustering

  • Jeong, Shin-Cheol;Song, Byung-Cheol
    • ETRI Journal / Vol. 32, No. 4 / pp.596-602 / 2010
  • This paper proposes a computationally efficient learning-based super-resolution algorithm using k-means clustering. Conventional learning-based super-resolution requires a huge dictionary for reliable performance, which incurs a large memory cost as well as a burdensome matching computation. To overcome this problem, the proposed algorithm significantly reduces the size of the trained dictionary by clustering similar patches in the learning phase. Experimental results show that the proposed algorithm provides visual quality superior to that of conventional algorithms while requiring much lower computational complexity.

동적 공정계획에서의 기계선정을 위한 다목적 유전자 알고리즘 (Multi-Objective Genetic Algorithm for Machine Selection in Dynamic Process Planning)

  • 최회련;김재관;이홍철;노형민
    • 한국정밀공학회지 / Vol. 24, No. 4 / pp.84-92 / 2007
  • Dynamic process planning requires not only more flexible capabilities of a CAPP system but also higher utility of the generated process plans. To meet these requirements, this paper develops an algorithm that selects machines for the machining operations by calculating the machine loads. The developed algorithm is based on a multi-objective genetic algorithm that yields a set of optimal solutions (generally known as Pareto-optimal solutions). The objective is to both minimize the number of part movements and maximize machine utilization. The algorithm is characterized by a new and efficient method for nondominated sorting through the K-means algorithm, which can reduce the running time, as well as a two-stage method for genetic operations, which can maintain a diverse set of solutions. The performance of the algorithm is evaluated by comparison with another multi-objective genetic algorithm, NSGA-II, and a branch-and-bound algorithm.

The Design of Fuzzy Controller by Means of Genetic Optimization and Estimation Algorithms

  • Oh, Sung-Kwun;Rho, Seok-Beom
    • KIEE International Transaction on Systems and Control / Vol. 12D, No. 1 / pp.17-26 / 2002
  • In this paper, a new design methodology for the fuzzy controller is presented. The performance of the fuzzy controller is sensitive to variation in the scaling factors. The design procedure is based on evolutionary computing (more specifically, a genetic algorithm) and an estimation algorithm, which adjust and estimate the scaling factors, respectively. The tuning of the scaling factors of the fuzzy controller is essential to the entire optimization process. We then estimate the scaling factors of the fuzzy controller by means of two types of estimation algorithms, HCM (Hard C-Means) and a Neuro-Fuzzy model [7]. The validity and effectiveness of the proposed estimation algorithm for the fuzzy controller are demonstrated on the inverted pendulum system.
