• Title/Summary/Keyword: K-Means clustering algorithm

Selection of Optimal Variables for Clustering of Seoul using Genetic Algorithm (유전자 알고리즘을 이용한 서울시 군집화 최적 변수 선정)

  • Kim, Hyung Jin;Jung, Jae Hoon;Lee, Jung Bin;Kim, Sang Min;Heo, Joon
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.175-181 / 2014
  • The Korean government proposed a new initiative, 'Government 3.0', under which the administration opens its datasets to the public before they are requested. The City of Seoul is the front runner in disclosing government data. If we know which attributes govern a given segmentation, the outcomes can be applied to real-world problems in marketing, business strategy, and administrative decision making. However, for the City of Seoul, selecting optimal variables from an open dataset with up to several thousand attributes would require an enormous amount of computation time, because it may require combinatorial optimization while maximizing dissimilarity measures between clusters. In this study, we acquired a dataset of 718 attributes from Statistics Korea and used a genetic algorithm and Dunn's index to select the variables that best differentiate Gangnam from other districts. We also used the Microsoft Azure cloud computing system to shorten the processing time. As a result, 28 optimal variables were selected, and validation with Ward's minimum variance method and the K-means algorithm showed that those 28 variables effectively separate Gangnam from the other districts.
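
Dunn's index, which the abstract above names as its cluster-separation measure, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the common definition (smallest distance between points of different clusters divided by the largest within-cluster diameter) with Euclidean distance, and the function name `dunn_index` and the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn's index needs at least two clusters")

    # Largest distance between two points of the same cluster (cluster diameter).
    max_diameter = max(
        (dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points that belong to different clusters.
    min_separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


# Hypothetical toy data: two tight groups far apart along the x axis.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # groups match the geometry
bad = dunn_index(points, [0, 1, 0, 1])   # groups cut across the geometry
```

A higher value indicates compact, well-separated clusters, so a search procedure such as a genetic algorithm could use it as a fitness score for a candidate variable subset.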

A Study on Classification Evaluation Prediction Model by Cluster for Accuracy Measurement of Unsupervised Learning Data (비지도학습 데이터의 정확성 측정을 위한 클러스터별 분류 평가 예측 모델에 대한 연구)

  • Jung, Se Hoon;Kim, Jong Chan;Kim, Cheeyong;You, Kang Soo;Sim, Chun Bo
    • Journal of Korea Multimedia Society / v.21 no.7 / pp.779-786 / 2018
  • In this paper, we apply a neural network so that the learning method reflects the data in its overall form, using cluster data rather than learning by stages; we then select a neural network model and analyze its variables through per-cluster learning. The CkLR algorithm is proposed to analyze the response variables of clustering outcomes through an approach to the initialization of K-means clustering, and to build a model that assesses the prediction rate of clustering and the prediction accuracy for new data inputs. The performance evaluation shows that the per-class accuracy on the test data was above 92%, the mean accuracy over the entire test set, confirming the advantage of the specialized per-class structure of the proposed neural network.

Analysis of Apartment Power Consumption and Forecast of Power Consumption Based on Deep Learning (공동주택 전력 소비 데이터 분석 및 딥러닝을 사용한 전력 소비 예측)

  • Yoo, Namjo;Lee, Eunae;Chung, Beom Jin;Kim, Dong Sik
    • Journal of IKEEE / v.23 no.4 / pp.1373-1380 / 2019
  • To increase energy efficiency, the advanced metering infrastructure (AMI) for smart grids has recently been under active development. An essential part of AMI is analyzing power consumption and forecasting consumption patterns. In this paper, we analyze power consumption data and summarize the data errors. Monthly power consumption patterns are also analyzed using the k-means clustering algorithm. Because forecasting the consumption pattern of each household is difficult, we first classify the data into 100 clusters and then predict the next day's average as the daily average of the clusters based on a deep neural network. Using AMI data collected in practice, we analyzed the data errors and successfully forecast power consumption based on a clustering technique.
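
The clustering step described above can be illustrated with a minimal k-means sketch (Lloyd's algorithm). It is an assumption-laden toy, not the paper's pipeline: the function name `kmeans`, the deterministic seeding from the first k points, and the three-value "daily profiles" are all hypothetical, and the deep neural network forecasting stage is not shown.

```python
from math import dist


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical consumption profiles (arbitrary units): two low-use and two high-use households.
profiles = [(1.0, 1.2, 0.9), (5.0, 5.5, 5.2), (1.1, 1.0, 1.0), (5.1, 5.4, 5.0)]
labels, centroids = kmeans(profiles, k=2)
```

Each returned centroid is the average profile of its cluster, which is the kind of per-cluster average a forecasting model could then be trained to predict.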

A Study on the Clustering method for Analysis of Zeus Botnet Attack Types in the Cloud Environment (클라우드 환경에서 제우스 Botnet 공격 유형 분석을 위한 클러스터링 방안 연구)

  • Bae, Won-il;Choi, Suk-June;Kim, Seong-Jin;Kim, Hyeong-Cheon;Kwak, Jin
    • Journal of Internet Computing and Services / v.18 no.1 / pp.11-20 / 2017
  • Recently, cloud computing technology has been developed and used in various fields. While the demand for cloud computing services is increasing, security threats in cloud computing environments are also increasing. In particular, when hosts interconnected in a cloud environment are infected by malware and the infection propagates, it can affect the resources of other hosts and lead to further security threats such as leakage of personal information and data deletion. Malware analysis to respond to these security threats has therefore been actively studied. This paper proposes a method for clustering Zeus botnet attack types using the k-means clustering algorithm, for the analysis of malware that occurs in cloud environments. By clustering malicious activity by Zeus botnet type in cloud environments, it is possible to determine whether an activity is malware. The future goal is to respond to attacks by new types of Zeus botnet that may occur in cloud environments.

A clustering method using the Coulomb Energy Network (쿨롱네트워크를 이용한 집락분석)

  • 이석훈;박래현;김응환
    • The Korean Journal of Applied Statistics / v.8 no.1 / pp.39-50 / 1995
  • This article deals with the problem that statistical clustering methods do not supply a clustering rule after the analysis. We modify the Coulomb Energy Network model, originally developed in physics, propose a model suited to our purpose, and show its implementation on actual data. Finally, the suggested method is compared with a well-known method, the K-means algorithm, using Rand C.
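
Comparing two clusterings of the same data can be sketched with the Rand index. This assumes that "Rand C" in the abstract above refers to Rand's agreement statistic, which the abstract does not confirm; the function name `rand_index` and the example labelings are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two points
    together, or both put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # label names differ, grouping is identical
cross_partition = rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # groupings cut across each other
```

The index depends only on which points are grouped together, so it is unaffected by how each method happens to number its clusters.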

Automatic Extraction of Component Inspection Regions from Printed Circuit Board by Image Clustering (영상 클러스터링에 의한 인쇄회로기판의 부품검사영역 자동추출)

  • Kim, Jun-Oh;Park, Tae-Hyoung
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.3 / pp.472-478 / 2012
  • The inspection machine in a PCB (printed circuit board) assembly line checks for assembly errors by inspecting the images inside the component inspection region. The component inspection region consists of the component package region and the soldering region. These regions must be extracted automatically for the auto-teaching system of the inspection machine. We propose an image segmentation method that automatically extracts the component inspection regions from PCB images. The acquired image is transformed to the HSI color model and then segmented into several regions by a clustering method. We develop a modified K-means algorithm to increase the extraction accuracy. Heuristics for generating the initial clusters and merging the final clusters are newly proposed. Vertical and horizontal projection is also developed to distinguish the component package region from the soldering region. Experimental results are presented to verify the usefulness of the proposed method.
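
The color-space step mentioned above can be sketched with the standard textbook RGB-to-HSI conversion. This is an illustrative assumption, not the paper's code: the function name `rgb_to_hsi` is hypothetical, inputs are assumed to be scaled to [0, 1], and hue is reported in degrees with 0 used by convention where hue is undefined (gray or black pixels).

```python
from math import acos, degrees, sqrt


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity

    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; max() guards rounding error.
    denominator = sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0.0:
        return 0.0, saturation, intensity  # gray: hue is undefined, use 0

    ratio = max(-1.0, min(1.0, numerator / denominator))  # clamp for acos
    hue = degrees(acos(ratio))
    if b > g:
        hue = 360.0 - hue  # lower half of the hue circle
    return hue, saturation, intensity


red = rgb_to_hsi(1.0, 0.0, 0.0)
green = rgb_to_hsi(0.0, 1.0, 0.0)
blue = rgb_to_hsi(0.0, 0.0, 1.0)
gray = rgb_to_hsi(0.5, 0.5, 0.5)
```

Pixels converted this way could then be fed, as feature vectors, to a clustering method such as K-means; note that hue is circular, so plain Euclidean distance on hue treats 359 and 1 degrees as far apart.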

Clustering with Adaptive weighting of Context-aware Linear regression (상황인식기반 선형회귀의 적응적 가중치를 적용한 클러스터링)

  • Lee, Kang-whan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.271-273 / 2021
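  • This paper proposes a clustering algorithm based on adaptive correction weights from deep-learning linear regression, in order to provide and maintain more efficient clustering of mobile nodes. In processing most clustered data, a classification scheme based on interrelationships is provided. In this case, among the neighboring mobile nodes, the one most likely to connect to the destination node must be selected as the relay node within the cluster. In this study, we present classification by adaptive weights in an unsupervised-learning-based regression model that understands this context information and considers the closeness of the correlation between the speed and direction attributes of dynamic mobile nodes. The paper presents a deep learning model with adaptive weights, based on unsupervised learning, that can understand this context information and maintain clustering.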

A Study on Nucleus Recognition of Uterine Cervical Pap-Smears using Fuzzy c-Means Clustering Algorithm (퍼지 c-Means 클러스터링 알고리즘을 이용한 자궁 세포진 핵 인식에 관한 연구)

  • Heo, Jung-Min;Kim, Jung-Min;Kim, Kwang-Baek
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.403-407 / 2005
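  • Segmenting the nucleus region in uterine cervical Pap-smear images is known to be the most difficult and most important part of an automated cervical cancer screening system. In this paper, the nucleus region is extracted from uterine cervical Pap-smear images using the HSI model. Features of the nucleus are then extracted by analyzing the morphometric, densitometric, colorimetric, and textural features of the extracted nucleus region. In addition, classification criteria for nuclei are defined according to the Bethesda System, and the extracted nucleus features are applied to the fuzzy c-Means clustering algorithm. The experimental results confirm that the proposed method is efficient for extracting and recognizing nuclei in uterine Pap-smears.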

A Stay Detection Algorithm Using GPS Trajectory and Points of Interest Data

  • Eunchong Koh;Changhoon Lyu;Goya Choi;Kye-Dong Jung;Soonchul Kwon;Chigon Hwang
    • International Journal of Internet, Broadcasting and Communication / v.15 no.3 / pp.176-184 / 2023
  • Points of interest (POIs) are widely used in tourism recommendations and to provide information about areas of interest. Currently, situation judgement using POI and GPS data is mainly rule-based. However, this approach is limited in that inferences can only be made from predefined POI information. In this study, we propose an algorithm that uses POI data, GPS data, and schedule information to calculate the current speed, location, schedule matching, movement trajectory, and POI coverage, and uses machine learning to determine whether the user is staying or moving. The input data are clustered and labelled by the k-means algorithm as unsupervised learning. The result is used as the input vector to train an SVM model that calculates the probability of moving and staying. We thus implemented an algorithm that can adjust the schedule using the travel schedule, POI data, and GPS information. The results show that the algorithm does not rely on predefined information but can make judgements using GPS and POI data in real time, which is more flexible and reliable than traditional rule-based approaches. The stay detection algorithm developed in this study can therefore help optimize tourism scheduling, provides important information for tourism schedule planning, and is expected to add value to tourism services.

Development of Image Process for Crack Identification on Porcelain Insulators (자기애자의 자기부 균열 식별을 위한 이미지 처리기법 개발)

  • Choi, In-Hyuk;Shin, Koo-Yong;An, Ho-Song;Koo, Ja-Bin;Son, Ju-Am;Lim, Dae-Yeon;Oh, Tae-Keun;Yoon, Young-Geun
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.33 no.4 / pp.303-309 / 2020
  • This study proposes a crack identification algorithm to analyze the surface condition of porcelain insulators and to visualize cracks efficiently. The proposed image processing algorithm for crack identification consists of two primary steps. In the first step, the brightness is eliminated by converting the image to the Lab color space. The background is then removed by the K-means clustering method. After that, morphological image processing and median filtering are applied to remove unnecessary noise, such as blobs. In the second step, the preprocessed image is converted to grayscale, and any cracks present in the image are identified. Next, region properties, such as the number of pixels and the ratio of the major to the minor axis, are used to separate the cracks from the noise. Using this image processing algorithm, the precision of crack identification for all the sample images was approximately 80%, and the F1 score was approximately 70. Thus, this method can be helpful for efficient crack monitoring.