• Title/Summary/Keyword: K-Means clustering algorithm


An Internet of Things (IoT) Service Clustering Method based on K-means Algorithm (K-means 기반 사물인터넷 서비스 분류 기법)

  • Yang, Chanwoo;Jo, Jeonghoon;Lee, Daewon
    • Annual Conference of KIPS / 2017.11a / pp.1326-1328 / 2017
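  • With the advent of the Fourth Industrial Revolution, a wide variety of Internet of Things (IoT) services are emerging at an explosive pace. Current IoT services are provided as standalone services, but future IoT services are being developed with the aim of reusing and combining existing IoT services. To resolve the module redundancy problem that can arise when IoT services are combined, and to improve the portability of new IoT services, this study proposes an IoT service classification algorithm that uses the K-means algorithm and takes the similarity between IoT services into account. Experiments and analysis show that 37 commercial IoT services are clustered efficiently and appropriately when K = 8 or 9.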
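
The K-means step this abstract relies on can be sketched as follows. This is a minimal pure-Python illustration of Lloyd's algorithm, not the authors' implementation: the feature vectors in `services`, the helper names `sq_dist`, `kmeans`, and `wcss`, and the use of the within-cluster sum of squares to compare candidate values of K are all assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points (a simple, common choice).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels

def wcss(points, centroids, labels):
    """Within-cluster sum of squares; lower means tighter clusters."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Hypothetical 2-D similarity features for six services (illustrative only).
services = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
            (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]

for k in (1, 2, 3):
    cents, labs = kmeans(services, k)
    print(k, round(wcss(services, cents, labs), 4))
```

Comparing the printed sum of squares across several values of K is one common way to judge which K is adequate; the abstract does not state which criterion the authors used.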

Irregular Sound Detection using the K-means Algorithm (K-means 알고리듬을 이용한 비정상 사운드 검출)

  • Lee Jae-yeal;Cho Sang-jin;Chong Ui-pil
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.341-344 / 2004
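  • Operating, monitoring, and diagnosing the equipment and machinery of generating facilities in service at a power plant is very important. Condition monitoring is used to detect anomalies in a power plant, and when an anomaly occurs, a fault diagnosis process follows in order to analyze the cause of the failure and plan appropriate action. This paper studies an algorithm that acquires the sounds generated by equipment operating at industrial sites and determines whether they are normal or abnormal. Sound monitoring can proceed in two steps: classifying the observed signal into acoustic events, and then classifying those events as normal or abnormal. Existing techniques have simply applied frequency analysis and pattern recognition. In this paper, we developed an algorithm that uses K-means clustering to classify sounds into acoustic events and then classifies the resulting sounds as normal or abnormal.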


Design of Partial Discharge Pattern Classifier of Softmax Neural Networks Based on K-means Clustering : Comparative Studies and Analysis of Classifier Architecture (K-means 클러스터링 기반 소프트맥스 신경회로망 부분방전 패턴분류의 설계 : 분류기 구조의 비교연구 및 해석)

  • Jeong, Byeong-Jin;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.1 / pp.114-123 / 2018
  • This paper presents a design and learning method for softmax-function neural networks based on K-means clustering. The partial discharge data are first processed through simulation using an Epoxy Mica Coupling sensor and an internal Phase Resolved Partial Discharge Analysis algorithm. The resulting information is then processed according to the characteristics of the pattern using a Motor Insulation Monitoring System (MIMS) program. The processed data fall into four types: void discharge, corona discharge, surface discharge, and slot discharge. The partial discharge data, which have high-dimensional input variables, are further processed by principal component analysis and reduced to low-dimensional input variables while preserving the characteristics of the pattern, which improves the processing speed of the pattern classifier. In addition, when the partial discharge data are extracted through the MIMS program, the amplitude is represented by either its maximum value or its average value, and the two resulting pattern characteristics are compared and analyzed. In the first part of the proposed partial discharge pattern classifier, K-means clustering is applied between the input and hidden layers to obtain the output of the hidden layer. In the latter part, the cross-entropy error function is used for parameter learning between the hidden layer and the output layer. The final output layer produces normalized probability values between 0 and 1 using the softmax function. The softmax function has the advantage that it extends to multi-class problems and admits a probabilistic interpretation. In particular, because each output value affects the remaining output values, learning is accelerated. To address overfitting, L2 regularization is applied. To demonstrate the superiority of the proposed pattern classifier, its classification rate is compared with that of conventional radial basis function neural networks.
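
The classifier structure described in this abstract can be illustrated with a short forward pass. This is a sketch under stated assumptions rather than the authors' architecture: the hidden layer is assumed to use a Gaussian activation around each cluster center, the overfitting remedy is modelled as an L2 weight penalty, and the centers, weights, `width`, and function names are hypothetical.

```python
import math

CLASSES = ["void", "corona", "surface", "slot"]  # the four discharge types

def rbf_hidden(x, centers, width=1.0):
    """Hidden-layer output: Gaussian activation of x around each cluster
    center (assumed activation; centers would come from K-means)."""
    return [math.exp(-sum((a - c) ** 2 for a, c in zip(x, ctr)) / (2 * width ** 2))
            for ctr in centers]

def softmax(z):
    """Map raw scores to probabilities in (0, 1) that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, centers, weights, width=1.0):
    """Input -> hidden (cluster-based) -> output (softmax)."""
    h = rbf_hidden(x, centers, width)
    z = [sum(w * hj for w, hj in zip(row, h)) for row in weights]
    return softmax(z)

def loss(probs, target, weights, l2=0.0):
    """Cross-entropy for one sample plus an L2 penalty on the output weights."""
    ce = -math.log(probs[target])
    penalty = l2 * sum(w * w for row in weights for w in row)
    return ce + penalty

# Hypothetical values: two cluster centers and a 4 x 2 output weight matrix.
centers = [(0.0, 0.0), (4.0, 4.0)]
weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [-1.0, -1.0]]

probs = forward((0.2, -0.1), centers, weights)
print(dict(zip(CLASSES, (round(p, 3) for p in probs))))
print(round(loss(probs, target=0, weights=weights, l2=0.01), 4))
```

Because every softmax output shares the same normalising sum, raising one class score lowers the probabilities of all the others, which is the coupling between outputs that the abstract refers to.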

Color-Texture Image Watermarking Algorithm Based on Texture Analysis (텍스처 분석 기반 칼라 텍스처 이미지 워터마킹 알고리즘)

  • Kang, Myeongsu;Nguyen, Truc Kim Thi;Nguyen, Dinh Van;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.18 no.4 / pp.35-43 / 2013
  • As texture images have become prevalent in a variety of industrial applications, copyright protection of these images has become an important issue. For this reason, this paper proposes a color-texture image watermarking algorithm that exploits texture properties inherent in the image. The proposed algorithm selects suitable blocks in which to embed a watermark, using the energy and homogeneity properties of the grey level co-occurrence matrices as inputs to the fuzzy c-means clustering algorithm. To embed the watermark, we first perform a discrete wavelet transform (DWT) on the selected blocks and choose one of the DWT subbands. Then, we embed the watermark into discrete cosine transformed blocks with a gain factor. We also explore the effects of the DWT subbands and gain factors on imperceptibility and on robustness against various watermarking attacks. Experimental results show that the proposed algorithm achieves higher peak signal-to-noise ratio values (47.66 dB to 48.04 dB) and lower M-SVD values (8.84 to 15.6) when the watermark is embedded into the HH band with a gain factor of 42, which indicates that the proposed algorithm performs well in terms of imperceptibility. In addition, the proposed algorithm is robust against various image processing attacks, such as noise addition, filtering, cropping, and JPEG compression, yielding higher normalized correlation values (0.7193 to 1).
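
For readers unfamiliar with the imperceptibility metric quoted in this abstract, the standard peak signal-to-noise ratio can be computed as in the sketch below. The function name, the flat pixel lists, and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not describe how the authors computed their figures.

```python
import math

def psnr(original, marked, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    PSNR = 10 * log10(peak^2 / MSE), where MSE is the mean squared error.
    """
    if len(original) != len(marked):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 2 x 2 greyscale block, flattened; every pixel is off by one level.
host = [120, 121, 119, 122]
watermarked = [121, 120, 120, 121]
print(round(psnr(host, watermarked), 2))
```

A mean squared error of about one grey level on 8-bit data gives roughly 48 dB, so values in the high 40s correspond to very small pixel changes.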

Study on News Video Character Extraction and Recognition (뉴스 비디오 자막 추출 및 인식 기법에 관한 연구)

  • 김종열;김성섭;문영식
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.1 / pp.10-19 / 2003
  • Caption information in news videos can be useful for video indexing and retrieval, since it usually suggests or implies the contents of the video very well. This paper proposes a new algorithm for extracting and recognizing characters from news video without a priori knowledge such as the font type, color, or size of the characters. In the text region extraction stage, to improve the recognition rate for low-resolution videos with complex backgrounds, consecutive frames with identical text regions are automatically detected and combined into an average frame. The averaged frame is projected in the horizontal and vertical directions, and region filling is applied to remove the background and isolate the characters. Then, K-means color clustering is applied to remove the remaining background and produce the final text image. In the character recognition stage, simple features, such as white runs and zero-one transitions from the center, are extracted from the unknown characters. These features are compared with a pre-composed character feature set to recognize the characters. Experimental results on various news videos show that the proposed method is superior in terms of caption extraction ability and character recognition rate.

A Case Study: Unsupervised Approach for Tourist Profile Analysis by K-means Clustering in Turkey

  • Yildirim, Mustafa Eren;Kaya, Murat;Furkan Ince, Ibrahim
    • Journal of Internet Computing and Services / v.23 no.1 / pp.11-17 / 2022
  • Data mining is the task of extracting useful information from a large volume of data. It can also be described as using computer algorithms to search large data warehouses for correlations that can provide clues about the future. It has been used in the tourism field for marketing, analysis, and business improvement purposes. This study aims to analyze the tourist profile in Turkey through data mining methods. Turkey was selected because it welcomes millions of tourists every year and can therefore serve as a role model for other tourist destinations. In this study, a large-scale anonymized data set was used in compliance with the law on the protection of personal data. The data set was obtained from a leading tourism company that is still active in Turkey. By applying the k-means clustering algorithm to these data, the key parameters of the profiles were obtained and people were clustered into groups according to their characteristics. The results show that the distinguishing characteristics fall under three main headings: the age of the tourists, the frequency of their vacations, and the period between the reservation and the vacation itself. These are the most important and characteristic parameters of a tourist's profile. Finally, planning future investments, events, and campaign packages around these findings can make tourism companies more competitive and improve their quality of service. For both businesses and tourists, it is advantageous to prepare individual events and offers for the three major groups of tourists.

The Development of Dynamic Forecasting Model for Short Term Power Demand using Radial Basis Function Network (Radial Basis 함수를 이용한 동적 - 단기 전력수요예측 모형의 개발)

  • Min, Joon-Young;Cho, Hyung-Ki
    • The Transactions of the Korea Information Processing Society / v.4 no.7 / pp.1749-1758 / 1997
  • This paper proposes a dynamic forecasting model for short-term power demand based on a Radial Basis Function Network and Pal's GLVQ algorithm. Radial Basis Function methods are often compared with the feed-forward network trained by backpropagation, which is the most widely used neural network paradigm. The Radial Basis Function Network is a feed-forward neural network with a single hidden layer. Each node of the hidden layer has a parameter vector called a center, which is determined by a clustering algorithm. Classical approaches to clustering include those of Hartigan (K-means algorithm), Kohonen (Self Organized Feature Maps: SOFM, and Learning Vector Quantization: LVQ model), and Carpenter and Grossberg (ART-2 model). In this model, the first step organizes the load patterns into two clusters using Pal's GLVQ clustering algorithm. GLVQ is used because it can classify the patterns better than other algorithms. The second step forecasts hourly load patterns with a radial basis function network constructed with two hidden nodes, which are determined from the cluster centers found by GLVQ in the first step. The model was applied to forecast the hourly loads on March 4, June 4, July 4, September 4, and November 4, 1995, after training on data from March 1 to 3, June 1 to 3, July 1 to 3, September 1 to 3, and November 1 to 3, 1995, respectively. In the experiments, the average absolute error of one-hour-ahead forecasts on actual utility data was 1.3795%.


The Development of the Vehicles Information Detector (AI 기법을 이용한 차량 정보 수집 장비 개발)

  • Moon, Hak-Yong;Ryu, Seung-Ki;Kim, Young-Chun;Byeon, Sang-Cheol;Choi, Do-Hyuk
    • Proceedings of the KIEE Conference / 2002.07b / pp.1283-1285 / 2002
  • This study develops a vehicle information detector using loop and piezo sensors. It analyzes the overall problems concerning local road conditions, environmental factors, and the distinctive features of local traffic, and on that basis develops the hardware, the software, a car classification algorithm based on artificial intelligence, and a traffic monitoring program that can be easily fixed. The work can be divided into a traffic detection algorithm and a car classification algorithm. In particular, we developed the car classification algorithm using the fuzzy C-means clustering method.


Optimization of Long-term Generator Maintenance Scheduling considering Network Congestion and Equivalent Operating Hours (송전제약과 등가운전시간을 고려한 장기 예방정비계획 최적화에 관한 연구)

  • Shin, Hansol;Kim, Hyoungtae;Lee, Sungwoo;Kim, Wook
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.2 / pp.305-314 / 2017
  • Most existing research on system-wide optimization of generator maintenance scheduling does not consider equivalent operating hours (EOHs), mainly because of the difficulty of calculating the EOHs of CCGTs in a large-scale system. To estimate the EOHs, not only the operating hours but also the number of start-ups and shutdowns during the planning period must be estimated, which requires the mathematical model to incorporate the economic dispatch model and the unit commitment model. The resulting model is inherently a large-scale mixed-integer nonlinear programming problem, and its computation time increases exponentially and becomes intractable as the system size grows. To make the problem tractable, this paper proposes an EOH calculation based on demand grouping by the K-means clustering algorithm. Network congestion is also considered in order to improve the accuracy of the EOH calculation. The proposed method is applied to the actual Korean electricity market and compared with other existing methods.

Improvement of the PFCM(Possibilistic Fuzzy C-Means) Clustering Method (PFCM 클러스터링 기법의 개선)

  • Heo, Gyeong-Yong;Choe, Se-Woon;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.1 / pp.177-185 / 2009
  • Cluster analysis, or clustering, is a kind of unsupervised learning method in which a set of data points is divided into a given number of homogeneous groups. Fuzzy clustering, one of the most popular clustering methods, allows a point to belong to all the clusters with different degrees, and so produces more intuitive and natural clusters than hard clustering does. Moreover, some fuzzy clustering variants are robust to noise. In this paper, we improved the Possibilistic Fuzzy C-Means (PFCM), which generates a membership matrix as well as a typicality matrix, using the Gath-Geva (GG) method. The proposed method focuses on the boundaries of clusters, unlike most other methods, which focus on the centers of clusters. The generated membership values are suitable for classification-type applications. Because the typicality values generated by the algorithm have a distribution similar to the values of a Gaussian density function, they are useful for Gaussian-type density estimation. Moreover, the GG method can handle clusters with different numbers of data points, which the other well-known method, by Gustafson and Kessel, cannot. All of these points are evident in the experimental results.
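
The idea that a point belongs to every cluster to a different degree can be made concrete with the standard fuzzy c-means membership formula. The sketch below covers only that baseline formula, not PFCM's typicality matrix or the Gath-Geva extension the paper proposes; the function name, the fuzzifier `m = 2`, and the sample coordinates are assumptions for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance
    from the point to center i and m > 1 is the fuzzifier.
    """
    d = [math.dist(point, c) for c in centers]
    if 0.0 in d:
        # The point coincides with one or more centers: share full
        # membership among those centers and give none to the rest.
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in d) for di in d]

# Hypothetical example: a point twice as far from the second center.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
print([round(v, 3) for v in u])  # memberships always sum to 1
```

Hard clustering would assign this point wholly to the nearer center; the fuzzy memberships instead record how strongly it belongs to each.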