• Title/Abstract/Keywords: Ward's Hierarchical Clustering Analysis

Search results: 8 (processing time: 0.023 s)

Customer Load Pattern Analysis using Clustering Techniques

  • 유승형;김홍석;오도은;노재구
    • KEPCO Journal on Electric Power and Energy / Vol. 2, No. 1 / pp. 61-69 / 2016
  • Understanding load patterns and classifying customers is a basic step in analyzing the behavior of electricity consumers. To that end, many studies have clustered customers' daily load data. The deployment of advanced metering infrastructure (AMI) and big-data technologies now makes it easier to study customers' load data. In this paper, we study load clustering from the viewpoint of yearly and daily load patterns. We compare four clustering methods: K-means clustering, hierarchical clustering (average linkage and Ward's method), and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We also discuss the relationship between the clustering results and the Korean Standard Industrial Classification (KSIC), which is one possible label for customers' load data. We find that hierarchical clustering with Ward's method is suitable for clustering load data, and that KSIC is characterized well by daily load patterns but not by yearly load patterns.
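
As a rough illustration of the Ward's method step named in the abstract above, the following minimal sketch merges, at each step, the pair of clusters whose union increases the within-cluster sum of squares the least. The four-value load profiles and the function names `ward_cost` and `ward_clusters` are hypothetical and chosen only for illustration; this is not the authors' implementation, and a real analysis would normally use an established library rather than this naive loop.

```python
from itertools import combinations

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    ca = [sum(col) / na for col in zip(*a)]  # centroid of cluster a
    cb = [sum(col) / nb for col in zip(*b)]  # centroid of cluster b
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * d2

def ward_clusters(points, k):
    """Naive agglomeration: merge the cheapest pair until k clusters remain."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so j is still valid
        del clusters[j]
    return clusters

# Hypothetical normalized daily load profiles (night, morning, afternoon, evening).
profiles = [
    (0.2, 0.9, 1.0, 0.3),  # daytime-heavy
    (0.3, 1.0, 0.9, 0.2),  # daytime-heavy
    (0.9, 0.3, 0.2, 1.0),  # night/evening-heavy
    (1.0, 0.2, 0.3, 0.9),  # night/evening-heavy
]
groups = ward_clusters(profiles, k=2)
```

With these toy profiles the two daytime-heavy customers end up in one group and the two night/evening-heavy customers in the other.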

Gene Screening and Clustering of Yeast Microarray Gene Expression Data

  • 이경아;김태훈;김재희
    • 응용통계연구 / Vol. 24, No. 6 / pp. 1077-1094 / 2011
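  • For the yeast cdc15 microarray gene expression data, we screened differentially expressed genes using a test statistic based on Fourier coefficients, which reflects the time-series nature of the data, together with the FDR multiple-comparison procedure. We then applied model-based clustering, K-means, PAM, SOM, hierarchical Ward clustering, and fuzzy clustering to the selected genes. We examine the characteristics of each clustering method and assess the clustering results with internal validity measures: connectivity, the Dunn index, and the silhouette value. We also explore the biological meaning of the clusters through GO analysis.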
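
Two of the internal validity measures named in the abstract above, the silhouette value and the Dunn index, can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, at least two clusters, at least one cluster with two distinct points, and one common variant of the Dunn index (smallest between-cluster point distance over largest cluster diameter). The toy points and function names are hypothetical and unrelated to the paper's data.

```python
import math

def mean_silhouette(clusters):
    """Average silhouette value over all points (assumes >= 2 clusters)."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # common convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for cj, other in enumerate(clusters)
                if cj != ci
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    min_sep = min(
        math.dist(p, q)
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
        for p in a
        for q in b
    )
    max_diam = max(math.dist(p, q) for c in clusters for p in c for q in c)
    return min_sep / max_diam

# Hypothetical toy clustering: two tight, well-separated groups.
toy = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
sil = mean_silhouette(toy)
dunn = dunn_index(toy)
```

Higher values of both measures indicate more compact and better-separated clusters, which is why they are useful for comparing the output of different clustering methods on the same data.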

A Comparative Study on Statistical Clustering Methods and Kohonen Self-Organizing Maps for Highway Characteristic Classification of National Highway

  • 조준한;김성호
    • 대한토목학회논문집 / Vol. 29, No. 3D / pp. 347-356 / 2009
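  • This study departs from the conventional definitions and methodology of highway functional classification and bases its analysis on highway characteristic classification, a method of classifying roads by their traffic characteristics. Within the overall classification process, the study focuses on computing factor scores from explanatory variables that reflect diverse traffic characteristics, on the clustering analysis that groups homogeneous road sections, and on comparing clustering results as the appropriate number of clusters is determined. For road classification, we compared Ward's method (agglomerative hierarchical clustering), the K-means method (non-hierarchical clustering), and K-SOM, which uses a self-organizing neural network. The comparison showed that K-means performed best with 5 or fewer clusters and Kohonen self-organizing maps performed best with 14 or more clusters. Between 5 and 9 clusters, the clustering performance of Ward's method and K-means showed irregular patterns, so the better method should be determined through detailed analysis of the results. The results are expected to serve as important basic data for classification work that analyzes and predicts the cluster attributes of road sections in light of diverse traffic characteristics.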

Types of Train Delay of High-Speed Rail: Indicators and Criteria for Classification

  • 김한수;강중혁;배영규
    • 한국경영과학회지 / Vol. 38, No. 3 / pp. 37-50 / 2013
  • The purpose of this study is to determine the indicators and criteria for classifying types of train delay on high-speed rail in South Korea. Train delays are divided into chronic delays and knock-on delays. The indicators, selected on the basis of relevance, reliability, and comparability, are: the rate of arrival delays over five minutes, the median arrival delay of the preceding and following trains, the rate of knock-on delays over five minutes, the correlation of delay between the preceding and following trains at intermediate and last stations, average train headway, average number of passengers per train, and average seat usage. Types of train delay were separated using Ward's hierarchical cluster analysis, and the classification criteria were derived with Fisher's linear discriminant. The analysis of the situational characteristics of train delays shows the following. If the train headway at the last station is short, the probability of chronic delay is high. If the planned running time of a train is short, the seriousness of chronic delay is high. The important causes of train delays are short train headway, short planned running times, delays of the preceding train, and an excessive number of passengers per train.

Pattern Clustering of Symmetric Regional Cerebral Edema on Brain MRI in Patients with Hepatic Encephalopathy

  • 임춘근;이희중
    • 대한영상의학회지 / Vol. 85, No. 2 / pp. 381-393 / 2024
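  • Purpose: Metabolic abnormalities in hepatic encephalopathy (HE) cause cerebral edema or demyelinating disease, producing symmetric regional cerebral edema on MRI. This study aimed to investigate whether clustering the patterns of symmetric regional cerebral edema on brain MRI in patients with HE is useful for predicting the development of brain failure. Materials and Methods: MR findings and clinical data of 98 consecutive patients with HE were analyzed retrospectively. Correlations among 12 regions of symmetric regional cerebral edema (SRCE) were calculated using the phi (φ) coefficient, and patterns were classified by hierarchical clustering using the φ² distance measure and Ward's method. The classified SRCE patterns were examined for correlation with clinical variables such as the model for end-stage liver disease (MELD) score and HE grade. Results: Significant associations were found between 22 pairs of regions of interest, including the red nucleus and corpus callosum (φ = 0.81, p < 0.001), the crus cerebri and red nucleus (φ = 0.72, p < 0.001), and the red nucleus and dentate nucleus (φ = 0.66, p < 0.001). After hierarchical clustering, 24 cases were classified into Group I, 35 into Group II, and 39 into Group III. Group III had a higher MELD score (p = 0.04) and HE grade (p = 0.002) than Group I. Conclusion: This study showed that the pattern of symmetric regional cerebral edema in patients with HE may be useful for predicting hepatic preservation and the development of brain failure.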
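
The phi (φ) coefficient used in the abstract above measures association between two binary variables from their 2×2 contingency table. The sketch below shows only that computation; the present/absent vectors for two regions across eight patients are hypothetical, and the study's φ² distance and clustering steps are not reproduced here. The function assumes neither variable is constant, since the denominator would otherwise be zero.

```python
import math

def phi_coefficient(x, y):
    """Phi coefficient of two equal-length 0/1 sequences."""
    pairs = list(zip(x, y))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)  # both present
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # only x present
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # only y present
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)  # both absent
    denom = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical edema indicators (1 = present) for two regions in 8 patients.
region_a = [1, 1, 1, 0, 0, 0, 1, 0]
region_b = [1, 1, 0, 0, 0, 0, 1, 1]
phi = phi_coefficient(region_a, region_b)  # (3*3 - 1*1) / 16 = 0.5
```

For binary data the phi coefficient equals the Pearson correlation of the two 0/1 vectors, so values near 1 mean the two regions tend to be affected together.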

Classification of Daily Precipitation Patterns in South Korea using Multivariate Statistical Methods

  • Mika, Janos;Kim, Baek-Jo;Park, Jong-Kil
    • 한국환경과학회지 / Vol. 15, No. 12 / pp. 1125-1139 / 2006
  • A cluster analysis of diurnal precipitation patterns is performed using daily precipitation at 59 stations in South Korea from 1973 to 1996, separately for the four seasons of each year. The four seasons are shifted forward by 15 days relative to the conventional ones. The number of clusters is 15 in winter, 16 in spring and autumn, and 26 in summer. One class in each season is the totally dry day, on which no precipitation is observed at any station; it is treated separately in this study. The distribution of days among the clusters is rather uneven, with rather low area-mean precipitation occurring most frequently. These 4 (seasons) × 2 (wet and dry days) classes represent more than half (59%) of all days of the year. On the other hand, even the smallest seasonal clusters have at least 5 to 9 members in the 24-year (1973-1996) classification period. To account for spatial correlation, the cluster analysis is performed directly on the 5 to 8 major non-correlated coefficients of the diurnal precipitation patterns obtained by factor analysis. More specifically, hierarchical clustering based on Euclidean distance and Ward's method of agglomeration is applied. The relative variance explained by the clustering averages 63%, with better capability in spring (66%) and winter (69%) but lower than average in autumn (60%) and summer (59%). By applying weighted relative variances, i.e., dividing the squared deviations by the cluster averages, we obtain even better values, 78% on average, compared to the same index without clustering. This means that the highest variance remains in the clusters with more precipitation. Besides all statistics necessary for validating the final classification, 4 cluster centers are mapped for each season to illustrate the range of typical extremes, paired according to their area-mean precipitation or negative pattern correlation. Possible alternatives to the performed classification and the reasons for their rejection are also discussed, along with a wide spectrum of recommended applications.
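
One common way to express the "relative variance explained by the clustering" is one minus the ratio of the within-cluster sum of squares to the total sum of squares. The sketch below assumes that definition, which may differ in detail from the one used in the paper; the one-dimensional toy values and the function name `explained_variance` are hypothetical.

```python
def explained_variance(clusters):
    """1 - (within-cluster sum of squares / total sum of squares)."""
    def centroid(pts):
        return [sum(col) / len(pts) for col in zip(*pts)]

    def sum_of_squares(pts, center):
        return sum((x - m) ** 2 for p in pts for x, m in zip(p, center))

    all_points = [p for c in clusters for p in c]
    total = sum_of_squares(all_points, centroid(all_points))
    within = sum(sum_of_squares(c, centroid(c)) for c in clusters)
    return 1 - within / total

# Hypothetical area-mean precipitation values grouped into two clusters.
toy_clusters = [[(0.0,), (2.0,)], [(8.0,), (10.0,)]]
ratio = explained_variance(toy_clusters)  # 1 - 4/68
```

Because Ward's method merges the pair of clusters that adds the least within-cluster sum of squares, this ratio is a natural summary of how much structure a Ward clustering retains at a given number of clusters.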

A Classification of Luxury Fashion Brands' E-commerce Sites

  • Kim, Sunghee
    • 패션비즈니스 / Vol. 17, No. 6 / pp. 125-140 / 2013
  • The aim of this study was to analyze the e-commerce sites of luxury fashion brands in order to provide insights into how to enhance online site quality. Forty-eight components of thirty-one luxury fashion brands' e-commerce sites were investigated during October 2013. To cluster the e-commerce site components and to segment the e-commerce sites of luxury brands, a hierarchical cluster analysis was applied using Ward's method and squared Euclidean distance for binary data. Further, Fisher's exact test was applied to distinguish the characteristics of three groups of luxury e-commerce sites. These analyses were carried out in SPSS 21. The results indicated that the components of e-commerce sites were grouped into three categories: basic elements, additional elements, and elements for building brand identity. These components were categorized by whether their functions were basic and essential or additional and advanced. The other criterion for categorization was related to brand identity. Furthermore, the luxury brands' e-commerce sites were segmented into three groups: a group endeavoring to promote goods, a group with undistinguished performance, and a group endeavoring to intensify brand identity. In this segmentation, brand identity and promotional aspects were decisive. Overall, luxury brands were trying to convey their traditional strengths through their e-commerce sites, and brand identity and promotional aspects played an important role in achieving this purpose.
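
For binary (0/1) data such as the presence or absence of site components, the squared Euclidean distance between two items reduces to the number of positions where they differ. The short sketch below illustrates this with two hypothetical five-component site vectors; it does not reproduce the SPSS procedure used in the study.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical component indicators (1 = site offers the component).
site_a = [1, 1, 0, 1, 0]
site_b = [1, 0, 0, 1, 1]

d2 = squared_euclidean(site_a, site_b)
mismatches = sum(a != b for a, b in zip(site_a, site_b))
```

Here both `d2` and `mismatches` count the two components on which the sites disagree.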

Lung Function Trajectory Types in Never-Smoking Adults With Asthma: Clinical Features and Inflammatory Patterns

  • Kim, Joo-Hee;Chang, Hun Soo;Shin, Seung Woo;Baek, Dong Gyu;Son, Ji-Hye;Park, Choon-Sik;Park, Jong-Sook
    • Allergy, Asthma & Immunology Research / Vol. 10, No. 6 / pp. 614-627 / 2018
  • Purpose: Asthma is a heterogeneous disease that responds to medications to varying degrees. Cluster analyses have identified several phenotypes and variables related to fixed airway obstruction; however, few longitudinal studies of lung function have been performed on adult asthmatics. We investigated clinical, demographic, and inflammatory factors related to persistent airflow limitation based on lung function trajectories over 1 year. Methods: Serial post-bronchodilator forced expiratory volume (FEV) 1% values were obtained from 1,679 asthmatics who were followed up every 3 months for 1 year. First, a hierarchical cluster analysis was performed using Ward's method to generate a dendrogram for the optimum number of clusters using the complete post-FEV1 sets from 448 subjects. Then, a trajectory cluster analysis of serial post-FEV1 sets was performed using the k-means clustering for the longitudinal data trajectory method. Next, trajectory clustering for the serial post-FEV1 sets of a total of 1,679 asthmatics was performed after imputation of missing post-FEV1 values using regression methods. Results: Trajectories 1 and 2 were associated with normal lung function during the study period, and trajectory 3 was associated with a reversal to normal of the moderately decreased baseline FEV1 within 3 months. Trajectories 4 and 5 were associated with severe asthma with a marked reduction in baseline FEV1. However, the FEV1 associated with trajectory 4 was increased at 3 months, whereas the FEV1 associated with trajectory 5 was persistently disturbed over 1 year. Compared with trajectory 4, trajectory 5 was associated with older asthmatics with less atopy, a lower immunoglobulin E (IgE) level, sputum neutrophilia and higher dosages of oral steroids. In contrast, trajectory 4 was associated with higher sputum and blood eosinophil counts and more frequent exacerbations. 
Conclusions: Trajectory clustering analysis of FEV1 identified 5 distinct types, representing well-preserved to severely decreased FEV1. Persistent airflow obstruction may be related to non-atopy, a low IgE level, and older age accompanied by neutrophilic inflammation and low baseline FEV1 levels.