• Title/Summary/Keyword: clustering-based pattern recognition

Enhanced FCM-based Hybrid Network for Pattern Classification (패턴 분류를 위한 개선된 FCM 기반 하이브리드 네트워크)

  • Kim, Kwang-Baek
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.9 / pp.1905-1912 / 2009
  • Clustering with the FCM algorithm sometimes produces undesirable results, depending on how the data are distributed in the clustered space, because data are classified by comparing membership degrees calculated from the Euclidean distance between input vectors and clusters. To tackle this problem, symmetrical measurement of clusters and fuzzy theory are applied to the classification. The enhanced FCM algorithm is less affected by variations in the distance to each cluster, the cluster centers, and cluster formation. An improved hybrid network that applies the FCM algorithm is proposed to classify patterns effectively. The proposed enhanced FCM algorithm is applied to the learning structure between the input and middle layers, and the normalized delta learning rule is applied in the learning stage between the middle and output layers of the hybrid network. The proposed algorithm is compared with an FCM-based RBF network using a Max_Min neural network, an FCM-based RBF network, and an HCM-based RBF network to evaluate learning and recognition performance on two-dimensional coordinate data.
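
For readers unfamiliar with the baseline, the sketch below shows how standard FCM derives membership degrees from Euclidean distances. It illustrates only the textbook update rule, not the enhanced algorithm of the paper; the function name `fcm_memberships`, the fuzzifier `m = 2`, and the sample points are assumptions chosen for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return u[i][k], the membership of point i in cluster k (textbook FCM rule)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = []
        for d_k in dists:
            # u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
            denom = sum((d_k / d_j) ** exponent for d_j in dists)
            row.append(1.0 / denom)
        memberships.append(row)
    return memberships

# Illustrative data (assumed): three 2-D points and two fixed cluster centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
for p, row in zip(points, u):
    print(p, [round(x, 3) for x in row])
```

Because each membership depends only on ratios of distances to the centers, the shape and spread of the clusters play no role, which is the limitation the abstract describes.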

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

  • Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
  • Clustering is a useful technique for grouping data points such that points within a single group, or cluster, have similar characteristics. Many clustering algorithms have been developed and used in engineering applications, including pattern recognition and image processing. Recently, clustering has drawn increasing attention as an important technique in data mining. However, clustering algorithms such as K-means and Fuzzy C-means suffer from two difficulties: the number of clusters must be determined a priori, and the clustering results depend on the initial set of clusters, which can lead to undesirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to solve the local optima problem, in which clustering converges to an undesirable state when it starts from an inappropriate set of clusters. We also adopt a new measure that represents how well data are clustered. The measure is determined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, our method determines the number of clusters automatically as the result of the optimization process. In addition, we combine heuristics, that is, problem-specific knowledge, with the evolutionary algorithm to speed up its search. We have tested our algorithm on several sets of multi-dimensional data, and the results show that our algorithm outperforms the existing algorithms.
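
The abstract does not give the formula for its clustering measure. As a rough illustration of the idea only, the sketch below scores a partition by dividing inter-cluster separation by intra-cluster dispersion. The function `clustering_score` and its exact definition are hypothetical and are not the measure used in the paper.

```python
import math

def centroid(cluster):
    """Mean point of a non-empty cluster of equal-length tuples."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))

def clustering_score(clusters):
    """Hypothetical score: minimum center separation / mean distance to own center.

    Higher is better. Requires at least two non-empty clusters.
    """
    centers = [centroid(c) for c in clusters]
    n_points = sum(len(c) for c in clusters)
    dispersion = sum(
        math.dist(p, ctr) for c, ctr in zip(clusters, centers) for p in c
    ) / n_points
    separation = min(
        math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]
    )
    return separation / dispersion if dispersion > 0 else math.inf

# Two candidate partitions of the same four points (assumed example data).
good = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
bad = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 1.0), (10.0, 1.0)]]
print(clustering_score(good), clustering_score(bad))
```

A score of this kind can serve as a fitness function in an evolutionary search, since candidate partitions with different numbers of clusters can be compared on the same scale.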

An Automatic Pattern Recognition Algorithm for Identifying the Spatio-temporal Congestion Evolution Patterns in Freeway Historic Data (고속도로 이력데이터에 포함된 정체 시공간 전개 패턴 자동인식 알고리즘 개발)

  • Park, Eun Mi;Oh, Hyun Sun
    • Journal of Korean Society of Transportation / v.32 no.5 / pp.522-530 / 2014
  • Spatio-temporal congestion evolution patterns can be reproduced using the VDS (Vehicle Detection System) historic speed datasets held in TMCs (Traffic Management Centers). Such a dataset provides a pool of spatio-temporally experienced traffic conditions. Traffic flow patterns are known to recur spatio-temporally, and even non-recurrent congestion caused by incidents follows patterns that depend on the incident conditions. This implies that the information should be useful for traffic prediction and traffic management. Traffic flow predictions are generally performed using black-box approaches such as neural networks and genetic algorithms. Black-box approaches are not designed to explain their modeling and reasoning process, nor to estimate the benefits and risks of implementing such a solution. TMCs are reluctant to employ black-box approaches, even though numerous valuable articles exist. This research proposes a more readily understandable and intuitively appealing data-driven approach and develops an algorithm for identifying congestion patterns for recurrent and non-recurrent congestion management and information provision.

Word Separation in Handwritten Legal Amounts on Bank Check by Measuring Gap Distance Between Connected Components (연결 성분 간 간격 측정에 의한 필기체 수표 금액 문장에서의 단어 추출)

  • Kim, In-Cheol
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.1 / pp.57-62 / 2004
  • We propose an efficient method of word separation in handwritten legal amounts on bank checks, based on the spatial gaps between connected components. Previous gap measures all suffer from an inherent problem of underestimation or overestimation, which degrades separation performance. To alleviate this problem, we have developed a modified version of each distance measure. In addition, a 4-class clustering-based method that integrates three different types of distance measures is proposed to compensate effectively for the errors in each measure, and it is expected to further improve word separation performance. Through a series of word separation experiments, we found that the modified distance measures outperform their corresponding original distance measures, with a word separation rate more than 2-3% higher. In addition, the proposed combining method based on 4-class clustering achieved further improvement by effectively reducing the errors common to two of the three distance measures as well as the individual errors.

Rapid discrimination of commercial strawberry cultivars using Fourier transform infrared spectroscopy data combined by multivariate analysis

  • Kim, Suk Weon;Min, Sung Ran;Kim, Jonghyun;Park, Sang Kyu;Kim, Tae Il;Liu, Jang R.
    • Plant Biotechnology Reports / v.3 no.1 / pp.87-93 / 2009
  • To determine whether pattern recognition based on metabolite fingerprinting of whole cell extracts can be used to discriminate cultivars metabolically, leaves and fruits of five commercial strawberry cultivars were subjected to Fourier transform infrared (FT-IR) spectroscopy. FT-IR spectral data from leaves were analyzed by principal component analysis (PCA) and Fisher's linear discriminant function analysis. The dendrogram based on hierarchical clustering analysis of these spectral data separated the five commercial cultivars into two major groups according to their origin. The first group consisted of the Korean cultivars 'Maehyang', 'Seolhyang', and 'Gumhyang', whereas in the second group 'Ryukbo' clustered with 'Janghee', both Japanese cultivars. The results from the analysis of fruits were the same as those from leaves. We therefore conclude that the hierarchical dendrogram based on PCA of FT-IR data from leaves represents the most probable chemotaxonomical relationship between cultivars, enabling discrimination of cultivars in a rapid and simple manner.

Prediction of ship power based on variation in deep feed-forward neural network

  • Lee, June-Beom;Roh, Myung-Il;Kim, Ki-Su
    • International Journal of Naval Architecture and Ocean Engineering / v.13 no.1 / pp.641-649 / 2021
  • Fuel oil consumption (FOC) must be minimized to determine the economic route of a ship; hence, the ship power must be predicted prior to route planning. For this purpose, a numerical method using test results of a model has been widely used. However, predicting ship power using this method is challenging owing to the uncertainty of the model test. An onboard test should be conducted to solve this problem; however, it requires considerable resources and time. Therefore, in this study, a deep feed-forward neural network (DFN) is used to predict ship power using deep learning methods that involve data pattern recognition. To use data in the DFN, the input data and a label (output of prediction) should be configured. In this study, the input data are configured using ocean environmental data (wave height, wave period, wave direction, wind speed, wind direction, and sea surface temperature) and the ship's operational data (draft, speed, and heading). The ship power is selected as the label. In addition, various treatments have been used to improve the prediction accuracy. First, ocean environmental data related to wind and waves are preprocessed using values relative to the ship's velocity. Second, the structure of the DFN is changed based on the characteristics of the input data. Third, the prediction accuracy is analyzed using a combination comprising five hyperparameters (number of hidden layers, number of hidden nodes, learning rate, dropout, and gradient optimizer). Finally, k-means clustering is performed to analyze the effect of the sea state and ship operational status by categorizing it into several models. The performances of various prediction models are compared and analyzed using the DFN in this study.
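
The abstract says that wind and wave data are preprocessed into values relative to the ship's velocity but does not give the formulas. One plausible reading for wind is sketched below: subtract the ship's velocity vector from the wind vector, then express the result relative to the heading. The function names and the direction convention (degrees clockwise from north, pointing where the flow is going) are assumptions, not details taken from the paper.

```python
import math

def to_vector(speed, direction_deg):
    """(east, north) components for a speed and a 'going-to' direction, clockwise from north."""
    rad = math.radians(direction_deg)
    return speed * math.sin(rad), speed * math.cos(rad)

def relative_wind(wind_speed, wind_dir_deg, ship_speed, heading_deg):
    """Wind speed and direction as seen from the moving ship (assumed preprocessing)."""
    wind_e, wind_n = to_vector(wind_speed, wind_dir_deg)
    ship_e, ship_n = to_vector(ship_speed, heading_deg)
    rel_e, rel_n = wind_e - ship_e, wind_n - ship_n
    rel_speed = math.hypot(rel_e, rel_n)
    rel_dir = math.degrees(math.atan2(rel_e, rel_n))  # going-to direction, clockwise from north
    rel_angle = (rel_dir - heading_deg) % 360.0       # measured from the ship's heading
    return rel_speed, rel_angle

# Ship heading north at 10 units; wind blowing toward the south at 5 units (a head wind).
rel_speed, rel_angle = relative_wind(5.0, 180.0, 10.0, 0.0)
print(rel_speed, rel_angle)
```

In this example the relative flow points toward the stern (180 degrees from the heading) at the sum of the two speeds, as expected for a head wind. The same vector subtraction could be applied to wave direction if that is how the authors treated it.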

Adaptive Data Mining Model using Fuzzy Performance Measures (퍼지 성능 측정자를 이용한 적응 데이터 마이닝 모델)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.541-546 / 2006
  • Data mining is the process of finding hidden patterns inside a large data set. Cluster analysis has been used as a popular technique for data mining. It is a fundamental process of data analysis and has been playing an important role in solving many problems in pattern recognition and image processing. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental decision on the number of clusters in the data. This is related to the cluster validity problem, that is, how well the analysis has identified the structure that is present in the data. In this paper, we design an adaptive data mining model using fuzzy performance measures. It discovers clusters through an unsupervised neural network model based on a fuzzy objective function and evaluates clustering results with a fuzzy performance measure. We also present experimental results on newsgroup data. They show that the proposed model can be used as a document classifier.

An Optimal Cluster Analysis Method with Fuzzy Performance Measures (퍼지 성능 측정자를 결합한 최적 클러스터 분석방법)

  • 이현숙;오경환
    • Journal of the Korean Institute of Intelligent Systems / v.6 no.3 / pp.81-88 / 1996
  • Cluster analysis partitions a collection of data points into a number of clusters, where the data points inside a cluster have a certain degree of similarity, and it is a fundamental process of data analysis. It has therefore been playing an important role in solving many problems in pattern recognition and image processing. For these problems, many clustering algorithms based on distance criteria have been developed, and fuzzy set theory has been introduced to reflect the description of real data, where boundaries might be fuzzy. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental question of cluster validity, that is, how well the analysis has identified the structure that is present in the data. Several validity functionals, such as the partition coefficient, classification entropy, and proportion exponent, have been used to measure validity mathematically. However, because the issue of cluster validity involves complex aspects, it is difficult to measure with a single function, as conventional studies do. In this paper, we propose four performance indices and a way to measure the quality of the clustering formed by a given learning strategy.
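
Two of the conventional validity functionals named in the abstract have simple closed forms. The sketch below computes the partition coefficient and the classification entropy from a fuzzy membership matrix, using their usual textbook definitions. It does not implement the four indices proposed in the paper, and the function names and sample matrices are illustrative assumptions.

```python
import math

def partition_coefficient(u):
    """Mean of squared memberships: 1.0 for a crisp partition, 1/c for a maximally fuzzy one."""
    return sum(x * x for row in u for x in row) / len(u)

def classification_entropy(u):
    """Mean of -u * ln(u): 0.0 for a crisp partition, ln(c) for a maximally fuzzy one."""
    return -sum(x * math.log(x) for row in u for x in row if x > 0.0) / len(u)

# Rows are data points, columns are clusters (assumed example matrices).
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
print(partition_coefficient(crisp), classification_entropy(crisp))
print(partition_coefficient(fuzzy), classification_entropy(fuzzy))
```

Both functionals look only at how crisp the memberships are and ignore the geometry of the data, which is one reason a single function of this kind gives an incomplete picture of cluster validity.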

Implementation of Elbow Method to improve the Gases Classification Performance based on the RBFN-NSG Algorithm

  • Jeon, Jin-Young;Choi, Jang-Sik;Byun, Hyung-Gi
    • Journal of Sensor Science and Technology / v.25 no.6 / pp.431-434 / 2016
  • Currently, the radial basis function network (RBFN) and various other neural networks are employed to classify gases using chemical sensor arrays, and their performance is steadily improving. In particular, the identification performance of the RBFN algorithm is being improved by optimizing parameters such as the center, width, and weight, and improved algorithms such as the radial basis function network-stochastic gradient (RBFN-SG) and radial basis function network-normalized stochastic gradient (RBFN-NSG) have been reported. In this study, we optimized the number of centers, which is one of the parameters of the RBFN-NSG algorithm, and observed the change in identification performance. For the experiment, repeated measurement data of 8 samples were used, and the elbow method was applied to determine the optimal number of centers for each sample of input data. The experiment was carried out for two cases (only one center per sample, and the optimal number of centers obtained by the elbow method), and the experimental results were compared using the mean square error (MSE). The results show that the case with the optimal number of centers, obtained using the elbow method, gave better identification performance than the case without any optimization.
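
The abstract does not state which criterion was used to locate the elbow. One common heuristic is sketched below: given the clustering error for each candidate number of centers, pick the point that lies farthest from the straight line joining the first and last points of the curve. The function `elbow_k` and the error values are assumptions for illustration, not data from the paper.

```python
def elbow_k(ks, sse):
    """Return the k whose (k, SSE) point is farthest from the chord between the curve's endpoints."""
    if len(ks) != len(sse) or len(ks) < 3:
        raise ValueError("need matching lists with at least three candidate values of k")
    k0, s0 = ks[0], sse[0]
    dk, ds = ks[-1] - k0, sse[-1] - s0
    best_k, best_gap = ks[1], -1.0
    for k, s in zip(ks[1:-1], sse[1:-1]):
        # Cross-product magnitude, proportional to the distance from the chord.
        gap = abs(ds * (k - k0) - dk * (s - s0))
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k

# Made-up within-cluster error for k = 1..5 centers.
candidate_ks = [1, 2, 3, 4, 5]
errors = [100.0, 40.0, 10.0, 8.0, 7.0]
chosen = elbow_k(candidate_ks, errors)
print(chosen)
```

Because the comparison uses a cross product, rescaling either axis does not change which k is selected, so the error values need no normalization before the search.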

Identification of failure mechanisms for CFRP-confined circular concrete-filled steel tubular columns through acoustic emission signals

  • Li, Dongsheng;Du, Fangzhu;Chen, Zhi;Wang, Yanlei
    • Smart Structures and Systems / v.18 no.3 / pp.525-540 / 2016
  • The CFRP-confined circular concrete-filled steel tubular column is composed of concrete, steel, and CFRP, and its failure mechanisms are complex. The main difficulty is the lack of an available method for establishing a relationship between a specific damage mechanism and its acoustic emission (AE) characteristic parameters. In this study, the AE technique was used to monitor the evolution of damage in CFRP-confined circular concrete-filled steel tubular columns. A fuzzy c-means method was developed to determine the relationship between the AE signals and failure mechanisms. Cluster analysis results indicate that the main AE sources are of five types: matrix cracking, debonding, fiber fracture, steel buckling, and concrete crushing. This technique can not only fully separate the five types of damage sources but also make it easier to judge the damage evolution process. Furthermore, typical damage waveforms were analyzed through wavelet analysis based on the cluster results, and the damage modes were determined according to the frequency distribution of the AE signals.