• Title/Summary/Keyword: k-mean clustering algorithm


Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.735-740 / 2012
  • In this paper, we introduce design methodologies for a polynomial radial basis function neural network (RBFNN) classifier with the aid of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Feature data are obtained through PCA and LDA preprocessing, which minimizes the information loss of the given data, and are then used as the input to the RBFNNs. The hidden layer of the RBFNNs is built with the Fuzzy C-Means (FCM) clustering algorithm instead of receptive fields, and linear polynomial functions are used as the connection weights between the hidden and output layers. To design an optimized classifier, structural and parametric values such as the number of eigenvectors of PCA and LDA and the fuzzification coefficient of the FCM algorithm are optimized by the Artificial Bee Colony (ABC) optimization algorithm. The proposed classifier is applied to several machine learning datasets, and its results are compared with those of other classifiers.

Property-based Hierarchical Clustering of Peers using Mobile Agent for Unstructured P2P Systems (비구조화 P2P 시스템에서 이동에이전트를 이용한 Peer의 속성기반 계층적 클러스터링)

  • Salvo, MichaelAngelG.;Mateo, RomeoMarkA.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.10 no.4 / pp.189-198 / 2009
  • Unstructured peer-to-peer (P2P) systems are the most commonly used on today's Internet, but file placement in these systems is random and no correlation exists between peers and their contents, so there is no guarantee that flooding queries will find the desired data. In this paper, we propose clustering the nodes of unstructured P2P systems with the agglomerative hierarchical clustering algorithm to improve the search method. We compared the delay time of clustering the nodes under our proposed algorithm and under the k-means clustering algorithm. We also simulated the delay time of locating data in a network topology and recorded the system overhead with our proposed algorithm, with k-means clustering, and without clustering. Simulation results show that the delay time of our proposed algorithm is shorter than that of the other methods and that resource overhead is also reduced.
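For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of bottom-up (agglomerative) hierarchical clustering. The abstract does not state the linkage criterion or the peer properties used, so average linkage, Euclidean distance, and the two-dimensional `peers` vectors below are assumptions made purely for illustration.

```python
import math

def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Uses average linkage (an assumption; the abstract does not name one).
    Returns clusters as lists of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical peer property vectors (not taken from the paper).
peers = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0), (9.2, 1.1)]
groups = agglomerative(peers, 3)
print(groups)
```

Unlike k-means, this procedure needs no initial centroids; the merge history itself forms the hierarchy, and stopping at a chosen cluster count is only one of several possible cut rules.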


A Study on Customer rating using RFM and K-Means (RFM 기법과 K-Means 알고리즘을 이용한 고객 분류)

  • Ji, Hyunjung;Shin, Gyeongil;Shin, Dongil;Shin, Dongkyoo
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.803-806 / 2017
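  • RFM (Recency, Frequency, Monetary) analysis of customer behavior is a market analysis technique widely used in marketing. As the volume of accumulated data has grown, interest in machine learning as a way to exploit it has increased, and attempts are being made to analyze data by combining the RFM technique with various algorithms. In this paper, we experiment with a method for grading customers using the RFM technique together with k-means, a representative clustering algorithm. Previous experiments often fixed the value of k at 8 or 9. In this experiment, however, we determined the optimal k for each data set using an internal evaluation method, and all four data sets used yielded the same result of 3.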
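The idea of choosing k by an internal evaluation measure rather than fixing it in advance can be sketched as follows. The abstract does not say which internal measure was used, so the silhouette score here is an assumption, as are the toy `customers` rows, the min-max scaling, and the deterministic farthest-first seeding (chosen only to keep the example reproducible).

```python
import math

def normalize(rows):
    """Min-max scale each column to [0, 1] so no RFM dimension dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / sp for v, lo, sp in zip(r, lows, spans)) for r in rows]

def init_centroids(points, k):
    """Farthest-first seeding (deterministic; an illustrative choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette score; higher means tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical (recency in days, frequency, monetary) rows, not from the paper.
customers = [
    (5, 20, 900), (7, 22, 950), (6, 19, 880), (4, 21, 920),
    (40, 8, 300), (45, 7, 280), (42, 9, 320), (38, 8, 310),
    (200, 1, 30), (210, 2, 40), (190, 1, 25), (205, 2, 35),
]
points = normalize(customers)
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
chosen_k = max(scores, key=scores.get)
labels = kmeans(points, chosen_k)
print(chosen_k, labels)
```

Scaling matters here: without it the monetary column would dominate the Euclidean distance and the recency and frequency dimensions would barely influence the grouping.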

Driving Characteristics Clustering use TCS Data (고속도로 통행료 수납자료를 이용한 주행특성 클러스터링 기법)

  • Kim, Dong-Keun;Park, Won-Sik;Yang, Young-Kyu
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.1025-1028 / 2009
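  • Highway driving characteristics include speeding vehicles, vehicles that stop at rest areas or for other purposes, and drivers' habits and fatigue, all of which produce differences in highway travel time. At present, however, travel times are classified without considering these characteristics, so accuracy and reliability cannot be guaranteed. In this study, by interpreting the data according to its distribution, we compare the Fuzzy c-means algorithm, which can take the characteristics of TCS data into account, with K-means, which simply classifies from arbitrary initial values, and demonstrate that a clustering technique that considers driving characteristics can, in some cases, be a more effective and reliable classification method.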
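The contrast drawn in this abstract between fuzzy c-means and k-means comes down to soft versus hard assignment: fuzzy c-means gives every observation a degree of membership in each cluster. The sketch below uses the standard textbook update rules with fuzzifier m = 2; the one-dimensional `travel_minutes` values and the initial centers are hypothetical and are not taken from the paper's TCS data.

```python
def fcm_memberships(points, centers, m=2.0):
    """Return u[j][i], the degree to which point j belongs to cluster i."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
        memberships.append(row)
    return memberships

def fcm_update_centers(points, memberships, n_clusters, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    centers = []
    for i in range(n_clusters):
        weights = [u[i] ** m for u in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(iters):
        u = fcm_memberships(points, centers, m)
        centers = fcm_update_centers(points, u, len(centers), m)
    return centers, fcm_memberships(points, centers, m)

# Hypothetical travel times in minutes; the last trip falls between the groups.
travel_minutes = [60, 62, 61, 63, 90, 92, 91, 75]
centers, memberships = fuzzy_c_means(travel_minutes, [60.0, 90.0])
for t, row in zip(travel_minutes, memberships):
    print(t, [round(v, 2) for v in row])
```

A trip that falls between the two groups receives a split membership instead of being forced into one class, which is the property that lets the method reflect mixed driving behavior.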

Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms (대표적인 클러스터링 알고리즘을 사용한 비감독형 결함 예측 모델)

  • Hong, Euyseok;Park, Mikyeong
    • KIPS Transactions on Software and Data Engineering / v.3 no.2 / pp.57-64 / 2014
  • Most previous studies of software fault prediction models, which determine the fault-proneness of input modules, have focused on supervised learning models that use a training data set. However, an unsupervised learning model is needed when a supervised model cannot be applied: either no past training data set exists or, even though one exists, the current project type has changed. Building an unsupervised learning model is extremely difficult, which is why only a few studies exist. In this paper, we build unsupervised models using representative clustering algorithms, EM and DBSCAN, that have not been used in prior studies, and compare them with the previous model that uses the K-means algorithm. The results show that the EM model performs slightly better than the K-means model in terms of error rate, and that these two models significantly outperform the DBSCAN model.

A Dispersion Mean Algorithm based on Similarity Measure for Evaluation of Port Competitiveness (항만 경쟁력 평가를 위한 유사도 기반의 이산형 평균 알고리즘)

  • Chw, Bong-Sung;Lee, Cheol-Yeong
    • Journal of Navigation and Port Research / v.28 no.3 / pp.185-191 / 2004
  • The mean and clustering are important methods of data mining, which is now widely applied to various multi-attribute problems. However, feature weighting and feature selection are important in those methods because features may differ in importance, and such differences need to be considered in data mining for multi-attribute problems. In addition, the arithmetic mean is inadequate for finding the best-fitting result when the evaluation structure has attributes that are weighted and ranked. Moreover, it is hard to capture the specific characteristics of a user group. In this paper, we propose a dispersion mean algorithm for evaluating a similarity measure based on geometrical figures, and apply it to means classified by user group. One of the key issues in evaluating the similarity measure is how to achieve objectivity, so that the result does not change with the item ranking in the evaluation process.

Design of RBFNN-Based Pattern Classifier for the Classification of Precipitation/Non-Precipitation Cases (강수/비강수 사례 분류를 위한 RBFNN 기반 패턴분류기 설계)

  • Choi, Woo-Yong;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.586-591 / 2014
  • In this study, we introduce a Radial Basis Function Neural Network (RBFNN) classifier that uses the Artificial Bee Colony (ABC) algorithm to distinguish precipitation events from non-precipitation events in given radar data. The input data are rebuilt through feature analysis of the meteorological radar data used by the Korea Meteorological Administration. In the condition phase of the proposed classifier, the fitness values are obtained with the Fuzzy C-Means clustering method, and the coefficients of the polynomial function used in the conclusion phase are estimated by the least squares method. In the aggregation phase, the final output is obtained by a fuzzy inference method. The performance of the proposed classifier is compared and analyzed using both the QC (quality control) data and the CZ (corrected reflectivity) data used by the Korea Meteorological Administration.

A Study on the Improvement of Fault Detection Capability for Fault Indicator using Fuzzy Clustering and Neural Network (퍼지클러스터링 기법과 신경회로망을 이용한 고장표시기의 고장검출 능력 개선에 관한 연구)

  • Hong, Dae-Seung;Yim, Hwa-Young
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.3 / pp.374-379 / 2007
  • This paper focuses on improving the fault detection algorithm in the FRTU (feeder remote terminal unit) on the feeders of a distribution power system. The FRTU is applied to fault detection schemes for phase faults and ground faults. In particular, the cold load pickup and inrush restraint functions distinguish the fault current from the normal load current. The FRTU shows an FI (Fault Indicator) when the fault current is over the pickup value or inrush current. STFT (Short Time Fourier Transform) analysis provides frequency and time information, and the FCM (Fuzzy C-Means clustering) algorithm extracts the characteristics of the harmonics. The neural network used as a fault detector was trained by a gradient descent method to distinguish the inrush current from the fault status. In this paper, fault detection is improved by using FCM and a neural network. The result data were measured in an actual 22.9 kV distribution power system.

The Algorithm of implementation for genome analysis ecosystems : Mitochondria's case (유전체 생태계 분석을 위한 알고리즘 구현: 미토콘드리아 사례)

  • Choi, Sung-Ja;Cho, Han-Wook
    • Journal of Digital Convergence / v.14 no.4 / pp.349-353 / 2016
  • Studies on the human environment and ecosystem analysis are being actively conducted. In recent years, genome analysis services have offered customized services that help prevent disease by reading an individual's genome information. Because of the spread of disease, the use of genetically modified organisms, and the influx of exotic species, accurate and fast analysis of ecosystem genomes is required. In this paper, the K-means clustering algorithm is used for a new classification system. It will provide new genome information quickly and accurately to many biologists.

Station Extension Algorithm Considering Destinations to Solve Illegal Parking of E-Scooters

  • Jeongeun, Song;Yoon-Ah, Song;ZoonKy, Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.2 / pp.131-142 / 2023
  • In this paper, we propose a new station selection algorithm to solve the illegal parking problem of shared electric scooters and improve service quality. Recently, shared electric scooters have attracted attention as a first- and last-mile means of transport between public transportation and final destinations, and thus as a solution to urban transportation problems. As the shared electric scooter market has grown rapidly, the problems caused by electric scooters have become serious. Therefore, in this study, text data are collected to understand the nature of the problem, the problems related to shared scooters are examined from the perspectives of pedestrians and users through LDA topic modeling, and a station extension algorithm is built on this basis. Some parking lots have already been installed, but their locations differ from the areas where scooters are actually towed. We therefore propose an algorithm that installs stations where the actual tow density is high, using a mixed clustering technique that applies K-means after primary clustering by DBSCAN and reflects the 'current state of electric scooter tow in Seoul'.