• Title/Summary/Keyword: C-Means clustering

Optimization Driven MapReduce Framework for Indexing and Retrieval of Big Data

  • Abdalla, Hemn Barzan;Ahmed, Awder Mohammed;Al Sibahee, Mustafa A.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.1886-1908 / 2020
  • With technical advances, the volume of big data grows daily, and traditional software tools struggle to handle it. The presence of imbalanced data in big data is a further major concern for the research community. To manage big data effectively and to deal with imbalanced data, this paper proposes a new indexing algorithm for retrieving big data in the MapReduce framework. In the mappers, data are clustered with the Sparse Fuzzy-c-means (Sparse FCM) algorithm. The reducer combines the clusters generated by the mappers and clusters the data again with the Sparse FCM algorithm. Two-level query matching then locates the requested data: the first level determines the cluster, and the second level accesses the requested data within it. The data are ranked using the proposed Monarch chaotic whale optimization algorithm (M-CWOA), which combines Monarch butterfly optimization (MBO) [22] and the chaotic whale optimization algorithm (CWOA) [21]. The Parametric Enabled-Similarity Measure (PESM) is adapted for matching the similarities between two datasets. The proposed M-CWOA outperformed other methods, with a maximal precision of 0.9237, recall of 0.9371, and F1-score of 0.9223.
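
The two-level query matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain Euclidean distance stands in for PESM, the clusters are assumed to be already computed, and the names `euclidean`, `two_level_match`, and `index` are hypothetical.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance; an assumed stand-in for the paper's PESM."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_level_match(query, clusters):
    """clusters is a list of (centroid, members) pairs.

    Level 1 picks the cluster whose centroid is closest to the query.
    Level 2 picks the closest member inside that cluster.
    """
    centroid, members = min(clusters, key=lambda c: euclidean(query, c[0]))
    best = min(members, key=lambda item: euclidean(query, item))
    return centroid, best

# Hypothetical toy index with two precomputed clusters.
index = [
    ((0.0, 0.0), [(0.1, -0.2), (-0.3, 0.4)]),
    ((5.0, 5.0), [(4.8, 5.1), (5.5, 4.6)]),
]
centroid, record = two_level_match((4.9, 5.0), index)
```

Searching the centroids first means only one cluster's members are scanned per query, which is the point of the two-level scheme.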

Design of FNN architecture based on HCM Clustering Method (HCM 클러스터링 기반 FNN 구조 설계)

  • Park, Ho-Sung;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference / 2002.07d / pp.2821-2823 / 2002
  • In this paper we propose Multi-FNNs (Fuzzy-Neural Networks) for optimal identification modeling of complex systems. The proposed Multi-FNNs are based on the concept of FNNs and exploit linear inference, treated as a generic inference mechanism. In network learning, the backpropagation (BP) algorithm of neural networks is used to update the parameters of the network. To control nonlinear processes with complex and uncertain data, the proposed model uses an HCM (Hard C-Means) clustering algorithm, which preprocesses the input-output data, and a genetic algorithm, which optimizes the model. The HCM clustering method is utilized to determine the structure of the Multi-FNNs. The parameters of the Multi-FNN model, such as the apexes of the membership functions, learning rates, and momentum coefficients, are adjusted using genetic algorithms. An aggregate performance index with a weighting factor is proposed in order to achieve a sound balance between the approximation and generalization abilities of the model. NOx emission process data from a gas turbine power plant are simulated to confirm the efficiency and feasibility of the proposed approach.
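
For readers unfamiliar with HCM, the following is a minimal sketch of textbook Hard C-Means clustering (the same alternating scheme as k-means). It is a generic illustration, not the paper's preprocessing code; the function names and the toy data are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_c_means(points, centers, iterations=20):
    """Alternate hard assignment and center update until the centers settle.

    centers holds the initial cluster centers, assumed to be supplied by
    the caller. Returns (centers, labels).
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point belongs to exactly one cluster.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))
            for point in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Toy data: two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
centers, labels = hard_c_means(points, [(0.0, 0.0), (6.0, 6.0)])
```

In a setting like the one the abstract describes, the resulting centers would serve as starting points for the structure of the model, for example the apexes of membership functions.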

Design of Pattern Classifier for Electrical and Electronic Waste Plastic Devices Using LIBS Spectrometer (LIBS 분광기를 이용한 폐소형가전 플라스틱 패턴 분류기의 설계)

  • Park, Sang-Beom;Bae, Jong-Soo;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.6 / pp.477-484 / 2016
  • Small industrial appliances such as fans, audio equipment, and electric rice cookers consist mostly of ABS, PP, and PS materials. Colored plastics can be classified by near-infrared (NIR) spectroscopy, whereas black plastics are very difficult to classify because black material absorbs the light. An RBFNN pattern classifier is therefore introduced for sorting electrical and electronic waste plastics using a LIBS (Laser-Induced Breakdown Spectroscopy) spectrometer. In the preprocessing part, PCA (Principal Component Analysis), a dimension reduction algorithm, is used to improve processing speed and to extract the effective characteristics of the data. In the condition part, FCM (Fuzzy C-Means) clustering is exploited. In the conclusion part, the coefficients of a linear function of polynomial type are used as connection weights. PSO and 5-fold cross-validation are used to improve the reliability of the performance and to enhance the classification rate. The performance of the proposed classifier is reported both with and without optimization.
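
As background for the FCM step mentioned in this abstract, here is a minimal sketch of the standard FCM update rules: memberships from distance ratios, and centers as membership-weighted means. It is a generic textbook version under assumed names and toy data, not the classifier from the paper; the fuzzifier `m` must be greater than 1.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster (standard FCM rule)."""
    dists = [
        sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
        for center in centers
    ]
    # A point lying exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

def fcm_centers(points, memberships, m=2.0):
    """Centers as weighted means of the points, with weights u ** m."""
    dims = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)
        ))
    return centers

# Toy data and assumed initial centers.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
centers = [(0.0, 0.0), (6.0, 6.0)]
for _ in range(10):
    U = [fcm_memberships(p, centers) for p in points]
    centers = fcm_centers(points, U)
```

Unlike hard clustering, each row of `U` spreads one unit of membership across all clusters, so such values can act as graded activations in the condition part of a network.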

Optimization of Fuzzy Set Fuzzy Model by Means of Hierarchical Fair Competition-based Genetic Algorithm using UNDX operator (UNDX연산자를 이용한 계층적 공정 경쟁 유전자 알고리즘을 이용한 퍼지집합 퍼지 모델의 최적화)

  • Kim, Gil-Sung;Choi, Jeoung-Nae;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference / 2007.04a / pp.204-206 / 2007
  • In this study, we introduce an optimization method for fuzzy inference systems that is based on Hierarchical Fair Competition-based Parallel Genetic Algorithms (HFCGA) and information data granulation. The granulation is realized with the aid of Hard C-Means (HCM) clustering. HFCGA is a kind of multi-population Parallel Genetic Algorithm (PGA), and it is used for structure optimization and parameter identification of the fuzzy model. The optimization concerns fuzzy model-related parameters such as the number of input variables to be used, the specific subset of input variables, the number of membership functions, the order of the polynomial, and the apexes of the membership functions. In the optimization process, two general optimization mechanisms are explored. The structural optimization is realized via HFCGA and the HCM method, whereas the parametric optimization proceeds with a standard least squares method as well as HFCGA. A comparative analysis demonstrates that the proposed algorithm is superior to conventional methods. In particular, for parameter identification we use the UNDX operator, which uses multiple parents and generates offspring around the geometric center of mass of these parents.

Determining the Fuzzifier Values for Interval Type-2 Possibilistic Fuzzy C-means Clustering (Interval Type-2 Possibilistic Fuzzy C-means 클러스터링을 위한 퍼지화 상수 결정 방법)

  • Joo, Won-Hee;Rhee, Frank Chung-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.27 no.2 / pp.99-105 / 2017
  • Type-2 fuzzy sets are preferred over type-1 sets because they can address uncertainty more efficiently. The fuzzifier values play a pivotal role in managing this uncertainty, yet selecting appropriate fuzzifier values has been a tedious task. Generally, a particular fuzzifier value is chosen from a given range of values based on observation. In this paper we adaptively compute suitable fuzzifier values for interval type-2 possibilistic fuzzy c-means (IT2 PFCM) for a given data set. Information is extracted from individual data points using a histogram approach and is further processed to yield the two fuzzifier values $m_1$ and $m_2$. The obtained values are kept within upper and lower bounds based on interval type-2 fuzzy sets.
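
To show why two fuzzifiers give an interval, the sketch below computes ordinary FCM memberships with $m_1$ and with $m_2$ and takes their minimum and maximum as lower and upper memberships. This is a simplified illustration of the interval type-2 idea only: it omits the possibilistic (typicality) part of PFCM and the histogram-based selection of $m_1$ and $m_2$ described in the abstract, and the function names and sample values are assumptions.

```python
def fcm_membership(dists, m):
    """Standard FCM memberships from distances to each center (all > 0)."""
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

def interval_memberships(dists, m1, m2):
    """Lower and upper memberships spanned by the two fuzzifiers."""
    u1 = fcm_membership(dists, m1)
    u2 = fcm_membership(dists, m2)
    lower = [min(a, b) for a, b in zip(u1, u2)]
    upper = [max(a, b) for a, b in zip(u1, u2)]
    return lower, upper

# One point at distance 1.0 from the first center and 3.0 from the second.
lower, upper = interval_memberships([1.0, 3.0], m1=1.5, m2=3.0)
```

The gap between `lower` and `upper` widens as $m_1$ and $m_2$ move apart, which is why the choice of the two fuzzifiers controls how much uncertainty the interval represents.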

Overall Analysis of Competitiveness of Asian Major Ports Using the Hybrid Mechanism of FCM and AHP (FCM법과 AHP법을 융합한 아시아 주요항만의 경쟁력에 관한 종합적 분석에 관한 연구)

  • Lee, Hong-Girl
    • Journal of Navigation and Port Research / v.27 no.2 / pp.185-191 / 2003
  • The aim of this research is to comprehensively analyze and classify the characteristics of major Asian ports. To achieve this aim, we first identified, through a review of the related literature, critical problems in the research methodology and research scope of most previous research. To overcome those problems, major ports in Asia were selected using objective indicators, and both AHP (Analytic Hierarchy Process) and FCM (Fuzzy C-Means), which address the weaknesses of previous clustering methods, were used. Through this hybrid approach, it was found that only 10 of the 16 major Asian ports had their own phases among major Asian ports. Those 10 ports were classified into 6 port groups, and the membership degree of each port within the 4 port groups and the ranking of each port were also analyzed. Finally, based on the results of this analysis, the present status and future direction of Busan port were discussed.

Classification of Music Data using Fuzzy c-Means with Divergence Kernel (분산커널 기반의 퍼지 c-평균을 이용한 음악 데이터의 장르 분류)

  • Park, Dong-Chul
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.3 / pp.1-7 / 2009
  • An approach to the classification of music genres using Fuzzy c-Means (FcM) with a divergence-based kernel is proposed in this paper. The proposed model utilizes the mean and covariance information of feature vectors extracted from music data and modelled by a Gaussian Probability Density Function (GPDF). Furthermore, since the classifier utilizes a kernel method that can convert a complicated nonlinear classification boundary into a simpler linear one, the classifier can improve its classification accuracy over conventional algorithms. Experiments on collected music data sets demonstrate that the proposed classification scheme outperforms conventional algorithms, including FcM and SOM, by 17.73%-21.84% on average in terms of classification accuracy.

Multiobjective Space Search Optimization and Information Granulation in the Design of Fuzzy Radial Basis Function Neural Networks

  • Huang, Wei;Oh, Sung-Kwun;Zhang, Honghao
    • Journal of Electrical Engineering and Technology / v.7 no.4 / pp.636-645 / 2012
  • This study introduces information granular-based fuzzy radial basis function neural networks (FRBFNN) based on multiobjective optimization and weighted least squares (WLS). An improved multiobjective space search algorithm (IMSSA) is proposed to optimize the FRBFNN. In the design of the FRBFNN, the premise part of the rules is constructed with the aid of Fuzzy C-Means (FCM) clustering, while the consequent part of the fuzzy rules is developed using four types of polynomials, namely constant, linear, quadratic, and modified quadratic. Information granulation realized with C-Means clustering helps determine the initial values of the apex parameters of the membership functions of the fuzzy neural network. To enhance the flexibility of the neural network, we use WLS learning to estimate the coefficients of the polynomials. In comparison with the ordinary least squares commonly used in the design of fuzzy radial basis function neural networks, WLS can yield a different type of local model in each rule when dealing with the FRBFNN. Since the performance of the FRBFNN model is directly affected by parameters such as the fuzzification coefficient used in FCM, the number of rules, and the orders of the polynomials in the consequent parts of the rules, we carry out both structural and parametric optimization of the network. The proposed IMSSA, which aims at the simultaneous minimization of complexity and maximization of accuracy, is exploited here to optimize the parameters of the model. Experimental results illustrate that the proposed neural network leads to better performance in comparison with some existing neurofuzzy models encountered in the literature.

Optimization of Fuzzy Learning Machine by Using Particle Swarm Optimization (PSO 알고리즘을 이용한 퍼지 Extreme Learning Machine 최적화)

  • Roh, Seok-Beom;Wang, Jihong;Kim, Yong-Soo;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.1 / pp.87-92 / 2016
  • In this paper, particle swarm optimization is used to optimize the parameters of the fuzzy Extreme Learning Machine. While the learning speed of conventional neural networks is very slow, that of the Extreme Learning Machine is very fast. The fuzzy Extreme Learning Machine combines the Extreme Learning Machine, with its very fast learning speed, and fuzzy logic, which can represent the linguistic information of field experts. A general sigmoid function is used as the activation function of the Extreme Learning Machine. The activation function of the fuzzy Extreme Learning Machine, however, is the membership function defined in the fuzzy C-Means clustering algorithm. We optimize the parameters of the membership functions using Particle Swarm Optimization. To validate the classification capability of the proposed classifier, we run several experiments on various machine learning data sets.

A Study on Optimized Decision Model for Transfer Crane Operation in Container Terminal (컨테이너터미널 트랜스퍼 크레인의 배정 및 이동경로 최적화 모델)

  • Shin, Jeong-Hoon;Yu, Song-Jin;Chang, Myung-Hee
    • Journal of Navigation and Port Research / v.32 no.6 / pp.465-471 / 2008
  • As the excessive competition between container terminals has deepened, not only the productivity but also the cost efficiency of the terminals has become an issue. In this regard, the competitiveness of the terminals is limited by the inefficient operation of transfer cranes (T/C), which consume a large amount of energy. Improving T/C operation can therefore save resource and energy costs as well as increase the productivity of the terminals. This study provides 'the K-Means Clustering based Optimized Decision Model for Transfer Crane Operation', referring to the 'RFID & RTLS based Port Logistics Initiative' of the Ministry of Land, Transportation and Maritime Affairs, and estimates its efficiency through simulation.