• Title/Summary/Keyword: Ward's method

Customer Load Pattern Analysis using Clustering Techniques (클러스터링 기법을 이용한 수용가별 전력 데이터 패턴 분석)

  • Ryu, Seunghyoung;Kim, Hongseok;Oh, Doeun;No, Jaekoo
    • KEPCO Journal on Electric Power and Energy / v.2 no.1 / pp.61-69 / 2016
  • Understanding load patterns and customer classification is a basic step in analyzing the behavior of electricity consumers. To that end, there has been much research on clustering customers' daily load data. The deployment of advanced metering infrastructure (AMI) and big-data technologies now makes it easier to study customers' load data. In this paper, we study load clustering from the viewpoint of yearly and daily load patterns. We compare four clustering methods: K-means clustering, hierarchical clustering (average linkage and Ward's method) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We also discuss the relationship between clustering results and the Korean Standard Industrial Classification (KSIC), which is one possible label for customers' load data. We find that hierarchical clustering with Ward's method is suitable for clustering load data, and that KSIC is characterized well by the daily load pattern but not by the yearly load pattern.
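
As a minimal sketch of the Ward's method named in the abstract above (not the authors' implementation), the merge criterion can be written in plain Python: at each step, the two clusters whose merger least increases the total within-cluster sum of squares are joined. The four-point load profiles and all function names are invented for illustration.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clustering(points, n_clusters):
    """Naive agglomeration: merge the cheapest pair until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Hypothetical daily load profiles (night, morning, afternoon, evening).
profiles = [
    [0.2, 0.9, 1.0, 0.3],  # daytime-heavy
    [0.3, 1.0, 0.9, 0.2],  # daytime-heavy
    [0.9, 0.2, 0.3, 1.0],  # night/evening-heavy
    [1.0, 0.3, 0.2, 0.9],  # night/evening-heavy
]
groups = ward_clustering(profiles, n_clusters=2)
```

With these toy profiles, `groups` pairs the two daytime-heavy customers and the two night/evening-heavy customers. The loop recomputes every centroid on every pass, so it is only suitable for illustration; for real AMI data a library routine such as `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the practical choice.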

Application of Multivariate Statistics for Characterization of Sensory Properties in Pre-cooked Foods (다변수 통계법을 이용한 조리식품의 관능특성 연구)

  • Yoon, Hee-Nam
    • Korean Journal of Food Science and Technology / v.23 no.6 / pp.711-716 / 1991
  • Various multivariate statistics were applied to determine the relationships between the sensory properties of nine pre-cooked foods. Twelve sensory terms were selected by stepwise discriminant analysis to differentiate the food samples. Three factors accounted for 61.9% of the total variation in the 12 sensory attributes. Factor I was highly related to the qualitative sensory terms, while factor II was related to the quantitative ones. The principal component plot made it possible to define the relationships between sensory properties and food samples. In cluster analysis using average linkage and Ward's method, the nine pre-cooked foods were classified into three clusters according to their sensory similarities.

Automatic Generation of the Local Level Knowledge Structure of a Single Document Using Clustering Methods (클러스터링 기법을 이용한 개별문서의 지식구조 자동 생성에 관한 연구)

  • Han, Seung-Hee;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.21 no.3 / pp.251-267 / 2004
  • The purpose of this study is to generate the local level knowledge structure of a single document, similar to the back-of-the-book indexes and tables of contents of printed material, through term clustering and cluster representative term selection. Furthermore, it aims to analyze the functionalities of the knowledge structure and to confirm the applicability of these methods in user-friendly information services. The results of the term clustering experiment showed that the performance of Ward's method was superior to that of the fuzzy K-means clustering method. In the cluster representative term selection experiment, using the term with the highest passage frequency as the representative yielded the best performance. Finally, the results of user task-based functionality tests illustrate that the automatically generated knowledge structure in this study functions similarly to the local level knowledge structure presented in printed material.

Classification of Climate Zones in South Korea Considering both Air Temperature and Rainfall (기온과 강수특성을 고려한 남한의 기후지역구분)

  • Park, Chang-Yong;Choi, Young-Eun;Moon, Ja-Yeon;Yun, Won-Tae
    • Journal of the Korean Geographical Society / v.44 no.1 / pp.1-16 / 2009
  • This study aims to classify climate zones in South Korea using Empirical Orthogonal Function (EOF) and cluster analyses that consider both air temperature and rainfall features. An examination of the seasonal characteristics of air temperature and rainfall shows that the distribution of air temperature is affected by topography and latitude in all seasons. The distribution of rainfall shows that the Yeongdong area, the southern coastal area and Jeju island have higher rainfall, while the central area of Gyeongsangbuk-do has the least rainfall. Cluster analyses using the average linkage method and Ward's method were carried out with input variables derived from principal component scores calculated through EOF analysis of air temperature and rainfall. Ward's method gave the best classification of climate zones. The classification reflected well the effects of topography, latitude, the sea, the movement of surface pressure systems, and administrative districts.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed to group observations into clusters, and a variety of measures aimed at validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means and model-based clustering, and compute validity measures such as connectivity, the Dunn index and the silhouette on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull prepared by the BBQ cooking method.
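
Two of the validity measures mentioned in the abstract above can be sketched in a few lines of plain Python. This is an illustrative version using Euclidean distance and the common textbook definitions, not the code used in the paper; the toy points and function names are assumptions.

```python
from math import dist
from statistics import mean

def silhouette(clusters):
    """Mean silhouette width; clusters is a list of lists of points."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of p's own cluster
            a = sum(dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b: smallest mean distance from p to any other cluster
            b = min(mean(dist(p, q) for q in other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return mean(scores)

def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    min_between = min(dist(p, q)
                      for i, a in enumerate(clusters)
                      for b in clusters[i + 1:]
                      for p in a for q in b)
    max_diameter = max(dist(p, q) for c in clusters for p in c for q in c)
    return min_between / max_diameter

# Two candidate partitions of the same four toy points.
good = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
bad = [[(0, 0), (5, 0)], [(0, 1), (5, 1)]]
```

Both measures are higher for better-separated partitions: here the Dunn index is 5.0 for `good` and 0.2 for `bad`, and the mean silhouette is positive for `good` and negative for `bad`.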

A Preliminary Study for Vulnerability Assessment to Natural Hazards in Gyeongsangnam-do (경남 시군별 자연재해 취약성 평가 및 유형 분류)

  • Kim, Sung Jae;Kim, Yong Wan;Choi, Young Wan;Kim, Sung Min;Jang, Min Won
    • KCID journal / v.19 no.1 / pp.97-105 / 2012
  • This study aimed to evaluate the vulnerability to different natural hazards such as flood, drought, and abnormal climate, and to classify the vulnerability patterns in Gyeongsangnam-do. Damage records and annual budgets from 2000 to 2009 were collected and ranked for all twelve si-guns. Sancheong-gun and Hamyang-gun were found to be the most vulnerable to flood and drought damage, and Hadong-gun and Yangsan-si were the most damaged by abnormal climate such as heavy snow and strong wind. In addition, three clusters were identified using Ward's method and interpreted. The results showed that the western areas of Gyeongsangnam-do might be more vulnerable to flood damage, while drought might threaten the eastern si-guns.

A Classification of Luxury Fashion Brands' E-commerce Sites

  • Kim, Sunghee
    • Journal of Fashion Business / v.17 no.6 / pp.125-140 / 2013
  • The aim of this study was to analyze e-commerce sites of luxury fashion brands in order to provide insights on how to enhance online site quality. For the research, forty-eight components of thirty-one luxury fashion brands' e-commerce sites were investigated during October 2013. To cluster the e-commerce site components and segment the e-commerce sites of luxury brands, a hierarchical cluster analysis was applied using Ward's method and squared Euclidean distance for binary data. Further, Fisher's exact test was applied in order to distinguish three groups of characteristics in the luxury e-commerce sites. These analyses were carried out with SPSS 21. The results indicated that the components of e-commerce sites were grouped into three categories: basic elements, additional elements and elements for building brand identity. These components were categorized by whether their functions were basic and essential or additional and advanced. The other criterion for categorization was related to brand identity. Furthermore, the luxury brands' e-commerce sites were segmented into three groups: a group endeavoring to promote goods, a group with undistinguished performance, and a group endeavoring to intensify brand identity. In this segmentation, brand identity or promotional aspects were decisive. Overall, luxury brands were trying to convey their traditional strength through their e-commerce sites. In order to achieve this purpose, brand identity or promotional aspects played an important role.
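
A small illustration of the distance measure named in the abstract above: for binary (presence/absence) data, the squared Euclidean distance between two vectors is simply the number of positions in which they disagree. The component vectors below are hypothetical and are not taken from the study.

```python
def squared_euclidean(u, v):
    """Sum of squared component differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical presence (1) / absence (0) of six site components on two sites.
site_a = [1, 1, 0, 1, 0, 0]
site_b = [1, 0, 0, 1, 1, 0]

distance = squared_euclidean(site_a, site_b)
mismatches = sum(a != b for a, b in zip(site_a, site_b))
```

Both `distance` and `mismatches` equal 2 here, because each differing component contributes exactly 1 to the sum of squares.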

Document Clustering Using Reference Titles (인용문헌 표제를 이용한 문헌 클러스터링에 관한 연구)

  • Choi, Sang-Hee
    • Journal of the Korean Society for Information Management / v.27 no.2 / pp.241-252 / 2010
  • Titles have been regarded as effective clustering features, but they sometimes fail to represent the topic of a document and result in poorly generated document clusters. This study aims to improve the performance of title-based document clustering by proposing titles in the citation bibliography as a clustering feature. Titles of the original literature, titles in the citation bibliography, and an aggregation of both were used to measure clustering performance. In the clustering experiment, each feature was combined with three hierarchical clustering methods: within-group average linkage, complete linkage, and Ward's method. The best result in this experiment was obtained by clustering documents with features from both kinds of titles using the within-group average linkage method.

Active Learning based on Hierarchical Clustering (계층적 군집화를 이용한 능동적 학습)

  • Woo, Hoyoung;Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.705-712 / 2013
  • Active learning aims to improve the performance of a classification model by repeatedly selecting the most helpful unlabeled data and adding it to the training set after labelling by an expert. In this paper, we propose a method for active learning based on hierarchical agglomerative clustering using Ward's linkage. The proposed method actively constructs a training set that includes at least one sample from each cluster and that reflects the overall data distribution by expanding the existing training set. While most existing active learning methods assume that an initial training set is given, the proposed method is applicable whether or not an initial training set is given. Experimental results show the superiority of the proposed method.
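
The abstract above does not say how a sample is chosen within each cluster. As one plausible reading, offered purely as an assumption, the sketch below takes clusters that have already been formed and sends the member closest to each cluster centroid to the expert for labelling, so that every cluster contributes at least one training sample. The function name and data are hypothetical.

```python
from math import dist

def pick_queries(clusters):
    """Return one representative per cluster: the member nearest the centroid."""
    queries = []
    for members in clusters:
        center = [sum(col) / len(members) for col in zip(*members)]
        queries.append(min(members, key=lambda p: dist(p, center)))
    return queries

# Hypothetical unlabeled points, already grouped by a prior clustering step.
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (0.4, 0.1)],
    [(5.0, 5.0), (6.0, 5.0), (5.5, 5.2)],
]
to_label = pick_queries(clusters)
```

Because the selection needs only the cluster structure, a scheme of this kind can start without any labeled data, which matches the abstract's remark that no initial training set is required.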

Characteristics Detection of Hydrological and Water Quality Data in Jangseong Reservoir by Application of Pattern Classification Method (패턴분류 방법 적용에 의한 장성호 수문·수질자료의 특성파악)

  • Park, Sung-Chun;Jin, Young-Hoon;Roh, Kyong-Bum;Kim, Jongo;Yu, Ho-Gyu
    • Journal of Korean Society on Water Environment / v.27 no.6 / pp.794-803 / 2011
  • A Self-Organizing Map (SOM) was applied for pattern classification of hydrological and water quality data measured monthly at Jangseong Reservoir. The primary objective of the present study is to better understand the characteristics of the data and the relationships among them. For this purpose, two SOMs were configured by a methodologically systematic approach with appropriate methods for data transformation and for determining the map size and the side lengths of the map. The SOMs constructed for the water quality data at the respective measurement stations (JSD1 and JSD2) both classified their datasets into five clusters according to the Davies-Bouldin Index (DBI). The trained SOMs were fine-tuned by Ward's method of hierarchical cluster analysis. On the one hand, the patterns with high values of standardized reference vectors for hydrological variables revealed, in general, a high possibility of eutrophication by TN or TP in the reservoir. On the other hand, the clusters with low values of standardized reference vectors for hydrological variables showed patterns with high COD concentration. In particular, Cluster1 at JSD1 and Cluster5 at JSD2 represented the worst water quality condition, with high reference vectors for rainfall and storage in the reservoir. Consequently, SOM is applicable to identifying patterns of potential eutrophication in reservoirs, based on a better understanding of data characteristics and their relationships.
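
The Davies-Bouldin Index mentioned in the abstract above can be sketched as follows, using the standard definition with Euclidean distance: for each cluster, take the worst ratio of summed within-cluster scatter to centroid separation against any other cluster, then average. This is an illustrative version, not the study's code, and the toy clusters are invented.

```python
from math import dist
from statistics import mean

def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    return tuple(mean(col) for col in zip(*points))

def davies_bouldin(clusters):
    """Davies-Bouldin Index of a partition; lower values mean better separation."""
    cents = [centroid(c) for c in clusters]
    # scatter[i]: mean distance from the members of cluster i to its centroid
    scatter = [mean(dist(p, c) for p in pts) for pts, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return mean(worst)

# Two tight, well-separated toy clusters.
toy = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
dbi = davies_bouldin(toy)
```

For the toy partition each cluster has scatter 1 and the centroids are 10 apart, so `dbi` is 0.2. When the index is used to choose the number of clusters, it is computed for each candidate partition and the partition with the smallest value is kept.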