• Title/Summary/Keyword: Data Organizing

Principal Components Self-Organizing Map PC-SOM (주성분 자기조직화 지도 PC-SOM)

  • 허명회
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.321-333
    • /
    • 2003
  • The self-organizing map (SOM), an unsupervised learning neural network, has been developed by T. Kohonen since the 1980s. Its main application areas were pattern recognition and text retrieval, so it did not spread among statisticians until recently. SOMs are now frequently used in data mining. Kohonen's SOM, however, needs improvements to become a standard tool for statisticians. First, there should be a good guideline for the size of the map. Second, an enhanced visualization mode is needed. In this study, the principal components self-organizing map (PC-SOM), a modification of Kohonen's SOM, is proposed to meet these needs. PC-SOM performs a one-dimensional SOM in the first stage to decompose input units into node weights and residuals. In the second stage, another one-dimensional SOM is applied to the residuals of the first stage. Finally, by putting the two stages together, one obtains a two-dimensional SOM. The procedure can easily be extended to construct maps of three or more dimensions. The number of grid lines along the second axis is determined automatically once that of the first axis is given by the data analyst. Furthermore, PC-SOM provides easily interpretable map axes. These merits of PC-SOM are demonstrated with Fisher's well-known iris data and a simulated data set.
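
The two-stage idea in this abstract (a one-dimensional SOM, then a second one-dimensional SOM on the residuals) can be sketched roughly as follows. This is an illustrative sketch, not the author's implementation: the function names, the learning-rate and neighbourhood schedules, and the random initialisation are all assumptions. It also takes the size of the second axis (`n2`) as an input, whereas the abstract says PC-SOM determines it automatically; that rule is not given here, so it is not reproduced.

```python
import numpy as np

def train_som_1d(data, n_nodes, n_iter=2000, lr0=0.5, seed=0):
    """Train a one-dimensional SOM; returns node weights, shape (n_nodes, dim)."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise nodes from randomly chosen data points.
    weights = data[rng.choice(len(data), n_nodes, replace=False)].astype(float)
    grid = np.arange(n_nodes)
    sigma0 = max(n_nodes / 2.0, 1.0)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # shrinking neighbourhood width
        x = data[rng.integers(len(data))]
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign(data, weights):
    """Index of the nearest node for every row of data."""
    dist = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

def two_stage_som(data, n1, n2):
    """Return (row, column) map coordinates from two stacked 1-D SOMs."""
    w1 = train_som_1d(data, n1, seed=0)
    i1 = assign(data, w1)
    residuals = data - w1[i1]          # stage 1: data = node weight + residual
    w2 = train_som_1d(residuals, n2, seed=1)
    i2 = assign(residuals, w2)         # stage 2: 1-D SOM on the residuals
    return np.column_stack([i1, i2])
```

Calling `two_stage_som(X, 5, 3)` on an `(n, d)` array `X` returns an `(n, 2)` integer array whose columns are the positions along the first and second map axes.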

Application of Self-Organizing Map for the Characteristics Analysis of Rainfall-Storage and TOC Variation in a Lake (호소수의 강우-저류량 및 TOC변동 특성분석을 위한 자기조직화 방법의 적용)

  • Kim, Yong Gu;Jin, Young Hoon;Jung, Woo Cheol;Park, Sung Chun
    • Journal of Korean Society on Water Environment
    • /
    • v.24 no.5
    • /
    • pp.611-617
    • /
    • 2008
  • It is necessary to analyze the characteristics of discharge and water quality data for efficient water resources management, for proactive countermeasures against flood inundation and various water pollution accidents, and for the basic information needed to manage lake water quality and to make environmental policy. Therefore, the present study applied the Self-Organizing Map (SOM), which shows excellent performance in classifying patterns with weights estimated by self-organization. The results revealed five patterns, and TOC versus rainfall-storage data for the respective patterns were depicted in two-dimensional plots. The visualization gave a better understanding of the data distribution patterns. The results of the present study are expected to contribute to modeling procedures for data prediction in the future.

Program Development of Integrated Expression Profile Analysis System for DNA Chip Data Analysis (DNA칩 데이터 분석을 위한 유전자발연 통합분석 프로그램의 개발)

  • 양영렬;허철구
    • KSBB Journal
    • /
    • v.16 no.4
    • /
    • pp.381-388
    • /
    • 2001
  • A program for integrated gene expression profile analysis, including hierarchical clustering, K-means, fuzzy c-means, self-organizing map (SOM), principal component analysis (PCA), and singular value decomposition (SVD), was developed in Matlab for DNA chip data analysis. It also includes a normalization method for gene expression input data. The integrated data analysis program can be used effectively in DNA chip data analysis and helps researchers gain a more comprehensive view of their own gene expression data.

Multiple Plane Area Detection Using Self Organizing Map (자기 조직화 지도를 이용한 다중 평면영역 검출)

  • Kim, Jeong-Hyun;Teng, Zhu;Kang, Dong-Joong
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.17 no.1
    • /
    • pp.22-30
    • /
    • 2011
  • Plane detection provides very important information for mission-critical robot tasks in 3D environments. A representative method of plane detection is the Hough transform, which is robust to noise and makes accurate plane detection possible, but it demands excessive memory and takes too much processing time. The iterative randomized Hough transform has been proposed to overcome these shortcomings. This method does not vote with all the data; it votes only one value, computed from randomly selected data, into the Hough parameter space. This value is the parameter of the shape to be extracted. In the Hough parameter space, accurate planes can be detected by repeatedly detecting the maximum value. A common problem of these methods is that they require too much computation and a large amount of memory to find the distribution of mixed multiple planes in the parameter space. In this paper, we detect multiple planes using only data sampling with the Self Organizing Map method. It does not use the conventional steps of transforming to the Hough parameter space, voting, and repetitive plane extraction. It also improves the reliability of plane detection through divided-area searching and planarity evaluation. Experiments under various conditions demonstrate that the proposed method is more accurate and faster than the conventional methods.
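
As background for the randomized Hough step described above (one parameter value computed from randomly selected data), the sketch below shows how a single plane hypothesis might be computed from three sampled 3D points. It illustrates only that sampling step, not the SOM-based method the paper proposes; the function names and the collinearity tolerance are assumptions.

```python
import numpy as np

def plane_from_points(p0, p1, p2, tol=1e-12):
    """Return (unit normal n, offset d) with n . p = d, or None if collinear."""
    n = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(n)
    if norm < tol:
        return None  # the three points do not define a plane
    n = n / norm
    return n, float(n @ p0)

def sample_plane(points, rng):
    """Pick three distinct points at random and compute their plane parameters."""
    idx = rng.choice(len(points), size=3, replace=False)
    return plane_from_points(*points[idx])
```

In a randomized Hough scheme, each such `(n, d)` pair would be one vote in the parameter space; `points` is assumed to be an `(N, 3)` NumPy array and `rng` a `numpy.random.Generator`.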

Optimal design of Self-Organizing Fuzzy Polynomial Neural Networks with evolutionarily optimized FPN (진화론적으로 최적화된 FPN에 의한 자기구성 퍼지 다항식 뉴럴 네트워크의 최적 설계)

  • Park, Ho-Sung;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference
    • /
    • 2005.05a
    • /
    • pp.12-14
    • /
    • 2005
  • In this paper, we propose a new architecture of Self-Organizing Fuzzy Polynomial Neural Networks (SOFPNN) by means of genetically optimized fuzzy polynomial neurons (FPNs) and discuss its comprehensive design methodology involving mechanisms of genetic optimization, especially genetic algorithms (GAs). Conventional SOFPNNs hinge on an extended Group Method of Data Handling (GMDH), exploit a fixed fuzzy inference type in each FPN of the SOFPNN, and consider a fixed number of input nodes in each layer. The design procedure applied in the construction of each layer of a SOFPNN deals with its structural optimization, involving the selection of preferred nodes (or FPNs) with specific local characteristics (such as the number of input variables, the order of the polynomial of the consequent part of the fuzzy rules, the specific subset of input variables, and the number of membership functions), and addresses specific aspects of parametric optimization. Therefore, the proposed SOFPNN gives rise to a structurally optimized network and comes with a substantial level of flexibility in comparison with conventional SOFPNNs. To evaluate the performance of the genetically optimized SOFPNN, the model is tested on two time series data sets (gas furnace and chaotic time series).

Development of a Knowledge Discovery System using Hierarchical Self-Organizing Map and Fuzzy Rule Generation

  • Koo, Taehoon;Rhee, Jongtae
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.431-434
    • /
    • 2001
  • Knowledge discovery in databases (KDD) is the process of extracting valid, novel, potentially useful, and understandable knowledge from real data. There are many academic and industrial activities involving new technologies and application areas. In particular, data mining is the core step in the KDD process, consisting of many algorithms that perform clustering, pattern recognition, and rule induction. The main goals of these algorithms are prediction and description. Prediction means the assessment of unknown variables. Description is concerned with providing understandable results in a format compatible with human users. We introduce an efficient data mining algorithm that considers both predictive and descriptive capability. Reasonable patterns are derived from real-world data by a revised neural network model, and a proposed fuzzy rule extraction technique is applied to obtain understandable knowledge. The proposed neural network model is a hierarchical self-organizing system. The rule base is compatible with decision makers' perception because the generated fuzzy rule set reflects human information processing. Results from a real-world application are analyzed to evaluate the system's performance.

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells
    • /
    • v.25 no.2
    • /
    • pp.279-288
    • /
    • 2008
  • The analysis of microarray data is essential for handling large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites, and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and dissimilarity metrics used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means, and self-organizing maps to complex clustering algorithms such as fuzzy clustering.

A Study of optimized clustering method based on SOM for CRM

  • Jong T. Rhee;Lee, Joon.
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.464-469
    • /
    • 2001
  • Customer Relationship Management (CRM) is an advanced marketing support system that analyzes customers' transaction data and classifies or targets customer groups to effectively increase market share and profit. Many engines have been developed to implement these functions, and those for classification and clustering are considered the core ones. In this study, an improved clustering method based on Self-Organizing Maps (SOM) is proposed. The proposed clustering method finds the optimal number of clusters so that the effectiveness of clustering is increased. It considers all the data types existing in CRM data warehouses. In particular, an adaptive algorithm is used in which the concepts of degeneration and fusion are applied to find the optimal number of clusters. The feasibility and efficiency of the proposed method are demonstrated through simulation with simplified customer data.

A New Self-Organizing Map based on Kernel Concepts (자가 조직화 지도의 커널 공간 해석에 관한 연구)

  • Cheong Sung-Moon;Kim Ki-Bom;Hong Soon-Jwa
    • The KIPS Transactions:PartB
    • /
    • v.13B no.4 s.107
    • /
    • pp.439-448
    • /
    • 2006
  • Previous recognition/clustering algorithms such as the Kohonen SOM (Self-Organizing Map), MLP (Multi-Layer Perceptron), and SVM (Support Vector Machine) might not adapt to unexpected input patterns, and their recognition rate depends highly on the complexity of their training patterns. These weak points can be compensated for by lowering the complexity of the original problem without losing its original characteristics. There are many ways to lower the complexity of a problem, and we chose kernel concepts as our approach. In this paper, using kernel concepts, the original data are mapped to a very high-dimensional (nearly infinite-dimensional) space. The data transferred into this space are therefore distributed more sparsely than in the original space, so as to ensure a higher recognition rate. The recognition rate is estimated with a new similarity-probing and learning method proposed in this paper. Using the CEDAR DB, whose data are cursive handwritten digits 0 to 9, we compare the recognition/clustering performance of the proposed kSOM with that of the previous SOM.

Fault Detection and Diagnosis for EVA Production Processes Using AE-SOM (AE-SOM을 이용한 EVA 생산 공정 이상 검출 및 진단)

  • Park, Byeong Eon;Ji, Yumi;Sim, Ye Seul;Lee, Kyu-Hwang;Lee, Ho Kyung
    • Korean Chemical Engineering Research
    • /
    • v.58 no.3
    • /
    • pp.408-415
    • /
    • 2020
  • In this study, the AE-SOM method, which combines an auto-encoder and a self-organizing map, is used to detect and diagnose faults in an EVA production process. The fault propagation pathways are then identified using the Granger causality test. One year and seven months of operation data were obtained to detect process faults, and the process variables of the autoclave reactor were the main focus of the analysis. In data pretreatment, the data are standardized and 200 samples of each grade are randomly chosen to build a fault detection model. The best matching unit (BMU) of each grade is then confirmed by applying AE-SOM, and faults are determined based on each BMU. When a fault is found, the variable contributing most to the fault is identified using a contribution plot, and the fault propagation pathway is identified by the Granger causality test. Precursors of the two shutdowns were detected, and the fault propagation pathway caused by the faulty variable was analyzed.
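
The BMU lookup and the contribution-plot step mentioned in this abstract might look roughly like the sketch below. It is a generic illustration under stated assumptions, not the paper's AE-SOM: the auto-encoder stage is omitted, the sample is assumed to be standardized already, and a variable's contribution is assumed to be its squared deviation from the BMU weight vector, which may differ from the definition used by the authors.

```python
import numpy as np

def best_matching_unit(sample, weights):
    """Return (index, distance) of the SOM node closest to the sample.

    weights has shape (n_nodes, n_variables); sample has shape (n_variables,).
    """
    dist = np.linalg.norm(weights - sample, axis=1)
    idx = int(np.argmin(dist))
    return idx, float(dist[idx])

def variable_contributions(sample, weights):
    """Per-variable squared deviation from the BMU (assumed contribution measure)."""
    idx, _ = best_matching_unit(sample, weights)
    return (sample - weights[idx]) ** 2
```

A large distance to the BMU would suggest an abnormal sample, and the index of the largest entry returned by `variable_contributions` would point to the variable that deviates most. Any threshold on that distance would have to be chosen from normal operating data.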