• Title/Summary/Keyword: clustering problem

A Study on an Extended Fuzzy Cluster Analysis (확장된 Fuzzy 집락분석방법에 관한 연구)

  • Im Dae-Heug
    • Management & Information Systems Review / v.9 / pp.25-39 / 2002
  • We consider fuzzy clustering, which partitions a set of objects into a given number of groups by assigning membership probabilities to each object. Previous research in this field shows that the fuzzy clustering concept is so involved that, for certain data sets, the main purpose of clustering cannot be attained as desired. We therefore propose a new objective function, named the Fuzzy-Entropy Function, to satisfy the main motivation of clustering, which is to classify the data clearly. Because the objective function is changed, we also suggest the Mean Field Annealing Algorithm as the optimization algorithm in place of the ISODATA traditionally used in this field. We show that the Mean Field Annealing Algorithm works well not only for the new objective function but also for the classical fuzzy objective function, indicating that the local minimum problem arising from ISODATA can be alleviated.

Color Data Clustering Algorithm using Fuzzy Color Model (퍼지컬러 모델을 이용한 컬러 데이터 클러스터링 알고리즘)

  • Kim, Dae-Won; Lee, Kwang H.
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.119-122 / 2002
  • This paper focuses on the efficient clustering of arbitrary color data. To tackle this problem, we have tried to model the inherent uncertainty and vagueness of color data using a fuzzy color model. By taking a fuzzy approach to color modeling, we can make a soft decision for the vague regions between neighboring colors. The proposed fuzzy color model defines a three-dimensional fuzzy color ball and a color membership computation method with two inter-color distance measures. With the fuzzy color model, we developed a new fuzzy clustering algorithm for the efficient partition of color data. Each fuzzy cluster set has a cluster prototype, which is represented by a fuzzy color centroid.

A practical application of cluster analysis using SPSS

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.20 no.6 / pp.1207-1212 / 2009
  • The basic objective of cluster analysis is to discover natural groupings of items or variables. In general, clustering is conducted on a similarity (or dissimilarity) matrix or on the original input text data. Various measures of similarity (or dissimilarity) between objects (or variables) have been developed. We introduce a real application of the clustering procedure in SPSS for the case where only the distance matrix of the objects (or variables) is given as input data. This is particularly helpful for the cluster analysis of large data sets for which the size of the proximity matrix exceeds 1000. The syntax commands for matrix input data in SPSS clustering are given with numerical examples.

Improving Real-Time Efficiency of Case Retrieving Process for Case-Based Reasoning

  • Park, Yoon-Joo
    • Asia Pacific Journal of Information Systems / v.25 no.4 / pp.626-641 / 2015
  • Conventional case-based reasoning (CBR) does not perform efficiently on high-volume datasets because of the case retrieval time. To overcome this problem, previous research suggested clustering a case base into several small groups and retrieving neighbors only within the group that corresponds to a target case. However, this approach generally yields less accurate predictions than conventional CBR. This paper proposes a new case-based reasoning method called clustering-merging CBR (CM-CBR). The CM-CBR method dynamically indexes a search pool for neighbor retrieval by considering the distance between a target case and the centroid of the corresponding cluster. The method is applied to three real-life medical datasets. Results show that the proposed CM-CBR method achieves similar or better predictive performance than the conventional CBR and clustering-CBR methods in numerous cases, at significantly lower computational cost.
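
The retrieval idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the actual CM-CBR indexing rule, so the merging criterion used here (include every cluster whose centroid distance is within `merge_ratio` times the nearest centroid distance) is purely an assumption for illustration, as are the function names and the toy data.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def build_search_pool(target, clusters, merge_ratio=1.5):
    """Collect cases from the nearest cluster and from any cluster whose
    centroid is almost as close (hypothetical merging rule, not the paper's)."""
    dists = [euclidean(target, centroid([case[0] for case in cl])) for cl in clusters]
    nearest = min(dists)
    pool = []
    for cl, d in zip(clusters, dists):
        if d <= merge_ratio * nearest:
            pool.extend(cl)
    return pool

def retrieve(target, clusters, k=3, merge_ratio=1.5):
    """Return the k nearest cases to the target from the merged search pool."""
    pool = build_search_pool(target, clusters, merge_ratio)
    return sorted(pool, key=lambda case: euclidean(target, case[0]))[:k]

# Toy case base: each case is (features, label); clusters are assumed precomputed.
clusters = [
    [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((0.0, 1.0), "b")],
    [((5.0, 5.0), "b"), ((6.0, 5.0), "b")],
    [((2.0, 2.0), "a"), ((3.0, 2.0), "b")],
]
target = (1.5, 1.5)
pool = build_search_pool(target, clusters)
neighbors = retrieve(target, clusters, k=3)
```

With `merge_ratio=1.0` the sketch reduces to plain clustering-CBR, which searches only the single nearest cluster; larger values trade retrieval time for a wider search pool.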

An Optimization Approach to Data Clustering

  • Kim, Ju-Mi; Olafsson, Sigurdur
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.05a / pp.621-628 / 2005
  • The scalability of clustering algorithms is a critical issue facing the data mining community. This is particularly true for computationally intense tasks such as data clustering. Random sampling of instances is one possible means of achieving scalability, but a pervasive problem with this approach is how to deal with the noise that it introduces into the evaluation of the learning algorithm. This paper develops a new optimization-based clustering approach using an algorithm specifically designed for noisy performance. Numerical results illustrate that this algorithm can achieve substantial savings in computational time without sacrificing solution quality.

Polynomial Fuzzy Radial Basis Function Neural Network Classifiers Realized with the Aid of Boundary Area Decision

  • Roh, Seok-Beom; Oh, Sung-Kwun
    • Journal of Electrical Engineering and Technology / v.9 no.6 / pp.2098-2106 / 2014
  • In the area of clustering, there are numerous approaches to constructing clusters in the input space. For regression problems, when the clusters form part of the overall model, the relationships between the input space and the output space are essential and have to be taken into consideration. Conditional Fuzzy C-Means (c-FCM) clustering offers an opportunity to analyze the structure in the input space with a mechanism of supervision implied by the distribution of data in the output space. However, like other clustering methods, c-FCM focuses on the distribution of the data. In this paper, we introduce a new method that uses an ambiguity index to focus on the boundaries of the clusters, whose determination is essential to the quality of the ensuing classification procedures. The introduced design is illustrated with numeric examples that provide detailed insight into the performance of the fuzzy classifiers and quantify several essential design aspects.
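
As a small illustration of the c-FCM supervision mechanism mentioned above, the following Python sketch computes the memberships of one pattern under the commonly cited conditional FCM formulation, in which the memberships sum to a conditioning value `f` in [0, 1] taken from the output space rather than to 1. The abstract does not state the update rule or define the ambiguity index, so the formula, the function name, and the example values are assumptions; the index itself is not sketched.

```python
import math

def conditional_fcm_memberships(x, prototypes, f, m=2.0):
    """Memberships of pattern x to each prototype under conditional FCM.

    Assumed rule: u_i = f / sum_j (d_i / d_j) ** (2 / (m - 1)),
    where d_i is the distance from x to prototype i, f is the conditioning
    value of x, and m > 1 is the fuzzification coefficient.
    """
    dists = [math.dist(x, v) for v in prototypes]
    # If x coincides with one or more prototypes, split f among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [f * h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [f / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]

# Example: one pattern, two prototypes, conditioning value 0.8.
u = conditional_fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)], f=0.8)
```

Setting `f=1.0` for every pattern recovers the usual FCM membership update, which makes the role of the output-space condition easy to see.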

A Study on the TICC(Time Interval Clustering Control) Algorithm which Using a Timing in MANET (MANET에서 Time Interval Clustering Control 기법에 관한 연구)

  • Kim, Young-Sam; Doo, Kyoung-Min; Kim, Sun-Guk; Lee, Kang-Whan; Chi, Sam-Hyeon
    • Proceedings of the IEEK Conference / 2008.06a / pp.629-630 / 2008
  • A MANET depends on node properties such as variable energy, a high degree of mobility, and the location environment of the nodes. In this paper, we propose TICC (Time Interval Clustering Control), a clustering technique based on the energy value of each node, to address the clustering problem. It improves cluster energy efficiency by managing nodes according to each node's energy level. The results show that node energy efficiency and lifetime are improved in the MANET.

Similarity measure for P2P processing of semantic data (시맨틱웹 데이터의 P2P 처리를 위한 유사도 측정)

  • Kim, Byung Gon; Kim, Youn Hee
    • Journal of Korea Society of Digital Industry and Information Management / v.6 no.4 / pp.11-20 / 2010
  • Ontologies play an important role in the semantic web for constructing and querying semantic data. Because of the dynamic nature of ontologies, a P2P environment is considered for ontology processing on the web. For efficient ontology processing in a P2P environment, the clustering of peers should be considered. When a new peer is added to the network, allocating it to a cluster is important for system efficiency. To cluster peers with similar characteristics, a method is needed for measuring the similarity between the ontology of the added peer and the ontologies in the existing clusters. In this paper, we propose ontology similarity measures for clustering peers. The proposed similarity measure considers structural characteristics of an ontology such as schema, class, and property. Experimental results show that ontologies with similar topics, classes, and properties can be allocated to the same cluster.
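
To make the cluster allocation step concrete, here is a minimal Python sketch in which a new peer's ontology is assigned to the cluster with the highest average similarity. The abstract does not specify the similarity measure, so the weighted Jaccard overlap of class and property names used here is a hypothetical stand-in, and the dictionary layout, weights, and sample ontologies are all assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections of names (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def ontology_similarity(o1, o2, w_class=0.5, w_prop=0.5):
    """Hypothetical similarity: weighted overlap of class and property names."""
    return (w_class * jaccard(o1["classes"], o2["classes"])
            + w_prop * jaccard(o1["properties"], o2["properties"]))

def assign_cluster(new_ontology, clusters):
    """Pick the cluster whose member ontologies are most similar on average."""
    def mean_similarity(members):
        return sum(ontology_similarity(new_ontology, m) for m in members) / len(members)
    return max(clusters, key=lambda name: mean_similarity(clusters[name]))

# Toy data: each ontology is reduced to its class and property names.
clusters = {
    "music": [{"classes": {"Artist", "Album"}, "properties": {"name", "releaseDate"}}],
    "biology": [{"classes": {"Gene", "Protein"}, "properties": {"name", "sequence"}}],
}
new_peer = {"classes": {"Artist", "Track"}, "properties": {"name", "duration"}}
chosen = assign_cluster(new_peer, clusters)
```

A name-overlap measure like this ignores schema structure such as class hierarchies, which the abstract says the proposed method takes into account.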

SUPPORT VECTOR MACHINE USING K-MEANS CLUSTERING

  • Lee, S.J.; Park, C.; Jhun, M.; Koo, J.Y.
    • Journal of the Korean Statistical Society / v.36 no.1 / pp.175-182 / 2007
  • The support vector machine has been successful in many applications because of its flexibility and high accuracy. However, when a training data set is large or imbalanced, the support vector machine may suffer from significant computational problems or a loss of accuracy in predicting minority classes. We propose a modified version of the support vector machine using K-means clustering that exploits the information in class labels during the clustering process. For large data sets, our method can save computation time by reducing the number of data points without significant loss of accuracy. Moreover, our method can deal with imbalanced data sets effectively by alleviating the influence of the dominant class.
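
The data reduction idea can be sketched in a few lines of Python. The abstract does not say how class labels enter the clustering, so this sketch assumes the simplest variant: run K-means separately within each class and keep the labelled cluster centers as the reduced training set, which would then be passed to any SVM trainer (not shown). The function names, the deterministic initialization, and the toy data are illustrative assumptions.

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means; the first k points serve as initial centers (illustration only)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[idx].append(p)
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append(tuple(sum(col) / len(g) for col in zip(*g)))
            else:
                new_centers.append(c)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def reduce_by_class(X, y, k_per_class):
    """Replace each class by at most k_per_class labelled cluster centers."""
    reduced = []
    for label in sorted(set(y)):
        pts = [x for x, lab in zip(X, y) if lab == label]
        k = min(k_per_class, len(pts))
        reduced.extend((center, label) for center in kmeans(pts, k))
    return reduced

# Imbalanced toy set: four points of class 0, one point of class 1.
X = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5)]
y = [0, 0, 0, 0, 1]
reduced = reduce_by_class(X, y, k_per_class=2)
```

Because every class is capped at the same number of centers, the dominant class shrinks more than the minority class, which is one plausible way such a reduction could lessen the imbalance described above.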

A Study on Korean isolated word recognition using LPC cepstrum and clustering (LPC Cepstrum과 집단화를 이용한 한국어 고립단어 인식에 관한 연구)

  • Kim, Jin-Yeong
    • The Journal of the Acoustical Society of Korea / v.6 no.4 / pp.44-54 / 1987
  • In this paper, the problem of the LP model and its solution by liftering in the cepstrum domain are investigated for speaker-independent isolated-word recognition. A clustering technique for obtaining the reference templates is also discussed. The KMA (K-means iteration with average) method, which is derived from the UWA method and the K-iteration method, is proposed, and the three methods are compared for clustering. Recognition experiments show a maximum recognition rate of 95% when the raised-sine lifter and the KMA clustering method are applied.
