• Title/Summary/Keyword: pattern clustering

Cause-specific Spatial Point Pattern Analysis of Forest Fire in Korea (우리나라 산불 발생의 원인별 공간적 특성 분석)

  • Kwak, Han-Bin;Lee, Woo-Kyun;Lee, Si-Young;Won, Myung-Soo;Koo, Kyo-Sang;Lee, Byung-Doo;Lee, Myung-Bo
    • Journal of Korean Society of Forest Science / v.99 no.3 / pp.259-266 / 2010
  • Forest fire occurrence in Korea is closely related to human activities, and its spatial distribution shows strong spatial dependency with a clustered pattern. In this study, we analyzed the spatial distribution pattern of forest fires using point pattern analysis that accounts for spatial dependency. Distributional patterns were derived from Ripley's K-function by cause and distance. Spatially clustered intensity was identified using kernel intensity estimation. As a result, forest fires in Korea show a clustered pattern, although the degree of clustering differs by cause. Furthermore, the spatial clustering patterns can be classified into two groups in terms of degree of clustering and distance. The first group shows a nationwide cluster pattern related to human activity near forests, such as accidental human-caused fires in the mountains and field incineration. The other group shows a localized cluster pattern, clustered within a short distance. It is associated with fires caused by smokers, arson, and accidents by children. The range of localized clustering was 30 km. Beyond this range, the pattern of forest fires gradually approached a random distribution. Kernel intensity analysis showed that the latter group, which has a localized cluster pattern, occurred near Seoul, where population density is high.
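
As a rough illustration of the Ripley's K-function mentioned in this abstract, the sketch below computes one common naive estimator, K(r) = A / (n(n-1)) × (number of ordered point pairs within distance r), and compares it with the value πr² expected under complete spatial randomness. It is not the authors' implementation: it applies no edge correction, and the function name `ripley_k`, the toy coordinates, and the 10 × 10 window are all assumptions made for the example.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K estimate at distance r (no edge correction)."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, that lie within distance r.
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    # K(r) = area / (n * (n - 1)) * number of ordered pairs within r
    return area * pairs / (n * (n - 1))

# Hypothetical toy data: two tight clumps inside a 10 x 10 window.
clustered = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3),
             (8.0, 8.0), (8.1, 7.8), (7.9, 8.2)]
r = 1.0
k_hat = ripley_k(clustered, r, area=100.0)
k_csr = math.pi * r ** 2  # expected K under complete spatial randomness
print(k_hat, k_csr)
```

An estimate well above πr² at a given distance suggests clustering at that scale; repeating the calculation over a range of r values is how a clustering range such as the one reported in the abstract would typically be read off.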

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.395-406 / 2017
  • The purpose of this study is to identify the pattern of daily electricity demand through clustering and classification. The hourly data were collected by KPX (Korea Power Exchange) between 2008 and 2012. Because electricity demand data are time series data, the time trend was removed before identifying the pattern of daily electricity demand. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. Classification analysis was conducted to understand the relationship between the clusters and external factors: day of the week, holiday, and weather. The data were divided into training data and test data. The training data consisted of external factors and cluster numbers between 2008 and 2011. The test data were daily data of external factors in 2012. Decision tree, random forest, support vector machine, and naive Bayes classifiers were used. As a result, Gaussian model-based clustering and random forest showed the best prediction performance when the number of clusters was 8.
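
To make the clustering step concrete, here is a minimal sketch of plain k-means (Lloyd's algorithm), one of the three methods the abstract compares, applied to short daily load profiles. It is only an illustration under assumptions: the function name `kmeans`, the four-value "daily" profiles, and the naive first-k initialization are invented for the example, and the detrending and classification stages described in the abstract are left out.

```python
def kmeans(profiles, k, iterations=20):
    """Plain Lloyd's k-means on equal-length daily profiles."""
    centers = [list(p) for p in profiles[:k]]  # naive init: first k profiles
    assign = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(profiles):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(profiles, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers

# Hypothetical demand profiles (4 readings per day instead of 24).
profiles = [
    [50, 80, 90, 60],  # weekday-like
    [40, 50, 55, 45],  # weekend-like
    [52, 82, 88, 61],  # weekday-like
    [41, 49, 56, 44],  # weekend-like
]
assign, centers = kmeans(profiles, k=2)
print(assign)
```

The resulting cluster labels are the kind of target that the classification stage would then try to predict from day of the week, holiday, and weather.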

Development of a Daily Pattern Clustering Algorithm using Historical Profiles (과거이력자료를 활용한 요일별 패턴분류 알고리즘 개발)

  • Cho, Jun-Han;Kim, Bo-Sung;Kim, Seong-Ho;Kang, Weon-Eui
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.10 no.4 / pp.11-23 / 2011
  • The objective of this paper is to develop a daily pattern clustering algorithm using historical traffic data that can reliably detect daily patterns under various traffic flow conditions on urban streets. The developed algorithm consists of two major parts: a macroscopic analysis and a microscopic analysis. First, the macroscopic analysis process derives daily peak/non-peak hours and the time zones to emphasize in the analysis, based on the speed time series. The microscopic analysis process then clusters daily patterns by comparing the similarity between individuals or between an individual and a group. The algorithm developed for the microscopic analysis process is called the "Two-step speed clustering (TSC) algorithm". The TSC algorithm improves the accuracy of daily pattern clustering based on time-series speed variation data. Experiments were conducted with data from point detectors installed in Ansan city, and the results were verified through comparison with a clustering technique in SPSS. This study is expected to contribute to pattern-based information processing, operations management of daily recurrent congestion, and improvement of daily signal optimization based on TOD plans.

EXTRACTING INSIGHTS OF CLASSIFICATION FOR TURING PATTERN WITH FEATURE ENGINEERING

  • OH, SEOYOUNG;LEE, SEUNGGYU
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.24 no.3 / pp.321-330 / 2020
  • Data classification and clustering are among the most common applications of machine learning. In this paper, we aim to provide insight into the classification of Turing pattern images, which are highly nonlinear, through feature engineering with machine learning that does not rely on a multi-layered algorithm. For given image data X whose pixel values lie in [-1, 1], X - X^3 and ∇X would be more meaningful features than X for representing the interface and bulk regions of complex pattern image data. Therefore, we use X - X^3 and ∇X in the neural network and clustering algorithm for classification. The results validate the feasibility of the proposed approach.
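
The two features named in this abstract, X - X^3 (X minus its cube) and ∇X, can be sketched in a few lines. The example below is only an illustration under assumptions: the function names, the forward-difference approximation of the gradient, and the tiny 2 × 6 "image" are not taken from the paper. It shows that both features vanish in bulk regions where X is at -1 or 1 and are nonzero across the transition between them.

```python
def cubic_feature(image):
    """Elementwise X - X^3, which is zero where X is -1, 0, or 1."""
    return [[x - x ** 3 for x in row] for row in image]

def gradient_magnitude(image):
    """|grad X| from forward differences, clamped at the last row/column."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dx = image[i][min(j + 1, cols - 1)] - image[i][j]
            dy = image[min(i + 1, rows - 1)][j] - image[i][j]
            out[i][j] = (dx * dx + dy * dy) ** 0.5
    return out

# Hypothetical image: bulk value -1 on the left, +1 on the right,
# with a narrow interface in between.
image = [
    [-1.0, -1.0, -0.5, 0.5, 1.0, 1.0],
    [-1.0, -1.0, -0.5, 0.5, 1.0, 1.0],
]
feature = cubic_feature(image)
grad = gradient_magnitude(image)
print(feature[0])
print(grad[0])
```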

An Adaption of Pattern Sequence-based Electricity Load Forecasting with Match Filtering

  • Chu, Fazheng;Jung, Sung-Hwan
    • Journal of Korea Multimedia Society / v.20 no.5 / pp.800-807 / 2017
  • Pattern Sequence-based Forecasting (PSF) is an approach to forecasting the behavior of a time series based on similar pattern sequences. The innovation of the PSF method is to convert the load time series into a label sequence using a clustering technique in order to lighten the computational burden. However, this raises a new problem of determining the number of clusters, and the method occasionally suffers from an insufficient number of similar days. In this paper, we propose an adaptation of the PSF method that introduces a new clustering index to determine the number of clusters and imposes a threshold to solve the problem caused by insufficient similar days. Our experiments show that the proposed method reduces the mean absolute percentage error (MAPE) by about 15% compared to the PSF method.
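
The basic pattern-sequence idea described in this abstract can be sketched as follows: take the most recent window of cluster labels, find earlier days where the same label sequence occurred, and average the load profiles of the days that followed. This is a simplified sketch of the baseline idea only, not the authors' adaptation; the function name `psf_forecast`, the toy labels and profiles, and the choice to return `None` when no similar days exist are assumptions for illustration.

```python
def psf_forecast(labels, profiles, window):
    """Forecast the next day's profile from days that followed the same label pattern."""
    pattern = labels[-window:]
    followers = []
    # Scan every earlier position where the pattern occurred and a next day exists.
    for start in range(len(labels) - window):
        if labels[start:start + window] == pattern:
            followers.append(profiles[start + window])
    if not followers:
        return None  # no similar days found; handling is left to the caller
    horizon = len(followers[0])
    return [sum(p[h] for p in followers) / len(followers) for h in range(horizon)]

# Hypothetical cluster labels per day and matching 2-value load profiles.
labels = [0, 1, 2, 0, 1, 2, 0, 1]
profiles = [[10, 12], [20, 22], [30, 32], [11, 13],
            [21, 23], [34, 36], [9, 11], [19, 21]]
forecast = psf_forecast(labels, profiles, window=2)
print(forecast)
```

The empty-followers branch is where the insufficient-similar-days problem mentioned in the abstract shows up; the threshold the authors propose would act at that point.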

The Optimization of Fuzzy Prototype Classifier by using Differential Evolutionary Algorithm (차분 진화 알고리즘을 이용한 Fuzzy Prototype Classifier 최적화)

  • Ahn, Tae-Chon;Roh, Seok-Beom;Kim, Yong Soo
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.2 / pp.161-165 / 2014
  • In this paper, we propose a fuzzy prototype pattern classifier. In the proposed classifier, each prototype is defined to describe its related sub-space, and a weight value is assigned to each prototype. The weight value assigned to a prototype changes the boundary surface. To define the prototypes, we use Fuzzy C-Means clustering, which is one of the fuzzy clustering methods. To optimize the weight values assigned to the prototypes, we use the Differential Evolutionary Algorithm. We use Linear Discriminant Analysis to estimate the coefficients of the polynomial that forms the consequent part of a fuzzy rule. Finally, machine learning data sets are used to evaluate the classification ability of the proposed pattern classifier.

XML Document Clustering Based on Sequential Pattern (순차패턴에 기반한 XML 문서 클러스터링)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.10D no.7 / pp.1093-1102 / 2003
  • As the use of the Internet grows, the amount of information is increasing rapidly, and XML, a standard for web data, offers flexible data representation. Therefore, web-based electronic document systems such as EDMS (Electronic Document Management System) and ebXML (e-business extensible Markup Language) have been adopting XML as the method for exchanging documents and as the document standard. Research is therefore required on methods that can manage and search structured XML documents effectively. In this paper, we propose a clustering method based on structural similarity among many XML documents, using typical structures extracted from each document by sequential pattern mining in a pre-clustering process. The proposed algorithm improves clustering accuracy by computing a cost that considers cluster cohesion and inter-cluster similarity.

A Pattern Clustering Approach to the Rule Acquisition for the Fuzzy controller of a CAMCODER (패턴 clustering에 의한 캠코더 퍼지 제어기의 rule 획득)

  • 장경식;정진영;신충식;신중인;방교윤;김재희
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.1 / pp.72-78 / 1993
  • While the rules for an expert system are obtained by interviewing domain experts or from the designer's own experience, these approaches are not adequate for fuzzy controllers, which deal with quantitative control values. In this paper, by considering a state of the controlled system as a pattern, we propose a method to obtain the control rules statistically. Namely, we propose a rule acquisition method that is objective, mechanical, and based on inductive inference, using a cluster-seeking algorithm or the K-means clustering algorithm. To validate this study, we show an example of iris control in a camcorder and analyze the rules acquired from 98 sample patterns consisting of 45 features.

Linear Discriminant Clustering in Pattern Recognition

  • Sun, Zhaojia;Choi, Mi-Seon;Kim, Young-Kuk
    • Proceedings of the IEEK Conference / 2008.06a / pp.717-718 / 2008
  • Fisher Linear Discriminant (FLD) is a simple and intuitive linear feature extraction method in pattern recognition. But in some special cases, such as the non-separable case or the case where one class's data is dispersed into several clusters, FLD does not work well. In this paper, a new discriminant named K-means Fisher Linear Discriminant, which combines FLD with K-means clustering, is proposed. It deals with these cases efficiently, possessing not only FLD's global-view merit but also K-means' local-view property. Finally, the simulation results demonstrate its advantage over K-means and FLD individually.

Use of Factor Analyzer Normal Mixture Model with Mean Pattern Modeling on Clustering Genes

  • Kim Seung-Gu
    • Communications for Statistical Applications and Methods / v.13 no.1 / pp.113-123 / 2006
  • The normal mixture model (NMM) is frequently used to cluster genes in microarray gene expression data. In this paper, some of the component means of the NMM are modelled by a linear regression model so that its design matrix represents the pattern between sample classes in the microarray matrix. Modelling the component means with given design matrices has the advantage that it can lead to clusters that are designed in advance. However, it suffers from an 'overfitting' problem because in practice gene data are often high-dimensional. This problem also arises when the NMM restricted by the linear model for the component means is fitted. To cope with this problem, this paper proposes using the factor analyzer NMM restricted by a linear model to cluster genes. Several design matrices that are useful for clustering genes are also provided.