• Title/Summary/Keyword: relative hierarchical clustering

Microarray data analysis using relative hierarchical clustering (상대적 계층적 군집 방법을 이용한 마이크로어레이 자료의 군집분석)

  • Woo, Sook Young;Lee, Jae Won;Jhun, Myoungshic
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.999-1009
    • /
    • 2014
  • Hierarchical clustering makes it easy to explore massive microarray data and to understand biological phenomena through a dendrogram. However, because hierarchical clustering algorithms consider only absolute similarity, they cannot express relative dissimilarity, which takes into account not only the distance between a pair of clusters but also how distant they are from the rest of the clusters. In this study, we introduce the relative hierarchical clustering method proposed by Mollineda and Vidal (2000) and compare it with the conventional hierarchical clustering method using simulated and real data in various situations. The quality of the two methods was evaluated using the percentage of incorrectly grouped points (PIGP), homogeneity, and separation.
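The idea of a relative dissimilarity can be illustrated with a small sketch. The abstract does not give the formula used by Mollineda and Vidal (2000), so the scaling below (absolute distance divided by the mean distance from the pair to all other clusters) is only an assumed stand-in, and the function names and toy centroids are hypothetical.

```python
import math

def euclid(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relative_dissimilarity(centroids, i, j):
    """Absolute distance d(i, j) scaled by the pair's mean distance to the rest.

    ASSUMPTION: this scaling is illustrative only; it is not taken from
    Mollineda and Vidal (2000).
    """
    d_ij = euclid(centroids[i], centroids[j])
    others = [k for k in range(len(centroids)) if k not in (i, j)]
    if not others:
        return d_ij
    context = sum(
        euclid(centroids[i], centroids[k]) + euclid(centroids[j], centroids[k])
        for k in others
    ) / (2 * len(others))
    return d_ij / context

# Hypothetical cluster centroids: pair (0, 1) is isolated, pair (2, 3) has a close neighbour.
centroids = [(0.0, 0.0), (2.0, 0.0), (20.0, 0.0), (22.0, 0.0), (21.0, 3.0)]

same_absolute = euclid(centroids[0], centroids[1]) == euclid(centroids[2], centroids[3])
isolated_pair = relative_dissimilarity(centroids, 0, 1)
crowded_pair = relative_dissimilarity(centroids, 2, 3)
```

Both pairs have the same absolute distance, but under this assumed measure the isolated pair is relatively closer, so a relative criterion would merge it first.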

Selection of Cluster Hierarchy Depth in Hierarchical Clustering using K-Means Algorithm (K-means 알고리즘을 이용한 계층적 클러스터링에서의 클러스터 계층 깊이 선택)

  • Lee, Won-Hee;Lee, Shin-Won;Chung, Sung-Jong;An, Dong-Un
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.2
    • /
    • pp.150-156
    • /
    • 2008
  • Many papers have shown that hierarchical clustering performs well but is limited by its quadratic time complexity. In contrast, K-means has lower time complexity when the number of variables is large. Considering simplicity, quality, and efficiency, we combine the two approaches in a new system named CONDOR, which builds a hierarchical structure for document clustering using the K-means algorithm. We evaluated performance for different hierarchy depths and for an initially undetermined number of centroids that varies with the relative amount of documents corresponding to the given queries. Compared with the regular method, in which the initial centroids are established in advance, our method improves performance considerably.
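A minimal sketch of building a cluster hierarchy by applying K-means recursively down to a chosen depth is shown below. It is not the CONDOR system: the deterministic farthest-first initialisation, the function names, and the toy 2-D "document vectors" are all assumptions made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, k, iters=20):
    """Plain K-means with a deterministic farthest-first initialisation (assumed)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a group ends up empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return [g for g in groups if g]

def build_hierarchy(points, k, depth):
    """Split a node with K-means and recurse until the chosen depth is reached."""
    node = {"points": points, "children": []}
    if depth == 0 or len(points) <= k:
        return node
    groups = kmeans(points, k)
    if len(groups) > 1:
        node["children"] = [build_hierarchy(g, k, depth - 1) for g in groups]
    return node

# Hypothetical document vectors forming two well-separated groups.
docs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
tree = build_hierarchy(docs, k=2, depth=1)
```

Each K-means call touches only the points of one node, which is how this kind of scheme avoids the all-pairs comparisons of agglomerative clustering; the `depth` argument is the hierarchy depth that the abstract treats as a tunable choice.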

Ant Colony Hierarchical Cluster Analysis (개미 군락 시스템을 이용한 계층적 클러스터 분석)

  • Kang, Mun-Su;Choi, Young-Sik
    • Journal of Internet Computing and Services
    • /
    • v.15 no.5
    • /
    • pp.95-105
    • /
    • 2014
  • In this paper, we present a novel ant-based hierarchical clustering algorithm in which ants repeatedly hop from one node to another over a weighted directed k-nearest-neighbor graph obtained from a given dataset. We introduce the notion of node pheromone, which is the sum of the pheromone on the incoming arcs of a node. The node pheromone can be regarded as a relative density measure in a local region. After the ants have made a finite number of hops, we remove nodes with a small amount of node pheromone from the directed graph and obtain a group of strongly connected components as clusters. We repeat this removal process while raising the threshold from a low value to a high value, yielding a hierarchy of clusters. We demonstrate the performance of the proposed algorithm on synthetic and real data sets in comparison with traditional clustering methods. Experimental results show that the proposed method is superior to the traditional methods.
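The thresholding step described above can be sketched as follows. The ant-hopping simulation that deposits pheromone is omitted; the arc pheromone values, node names, and function names are hypothetical, and Kosaraju's algorithm is used here simply as one standard way to find strongly connected components.

```python
from collections import defaultdict

def node_pheromone(arc_pheromone):
    """Node pheromone = sum of pheromone on the arcs entering each node."""
    total = defaultdict(float)
    for (src, dst), amount in arc_pheromone.items():
        total[dst] += amount
        total[src] += 0.0  # make sure source-only nodes are listed too
    return dict(total)

def strongly_connected_components(nodes, arcs):
    """Kosaraju's algorithm on the subgraph induced by `nodes` (recursive, small graphs)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for src, dst in arcs:
        if src in nodes and dst in nodes:
            fwd[src].append(dst)
            rev[dst].append(src)
    seen = set()

    def visit(u, graph, out):
        seen.add(u)
        for v in graph[u]:
            if v not in seen:
                visit(v, graph, out)
        out.append(u)

    order = []
    for u in sorted(nodes):
        if u not in seen:
            visit(u, fwd, order)
    seen.clear()
    components = []
    for u in reversed(order):
        if u not in seen:
            comp = []
            visit(u, rev, comp)
            components.append(sorted(comp))
    return sorted(components)

def clusters_at(arc_pheromone, threshold):
    """Drop nodes whose node pheromone is below the threshold, then take the SCCs."""
    pheromone = node_pheromone(arc_pheromone)
    kept = {n for n, p in pheromone.items() if p >= threshold}
    return strongly_connected_components(kept, arc_pheromone.keys())

# Hypothetical pheromone left on arcs: two dense groups bridged by a sparse node "c".
arcs = {
    ("a", "b"): 3.0, ("b", "a"): 3.0, ("b", "c"): 0.5, ("c", "b"): 1.0,
    ("c", "d"): 1.0, ("d", "c"): 0.5, ("d", "e"): 3.0, ("e", "d"): 3.0,
}
hierarchy = [clusters_at(arcs, t) for t in (0.5, 2.0)]
```

Raising the threshold removes the low-density bridge node, so the single component at the low threshold splits into two clusters at the higher one, which is the hierarchy the abstract describes.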

Quality Assessment of Curcuma longa L. by Gas Chromatography-Mass Spectrometry Fingerprint, Principle Components Analysis and Hierarchical Clustering Analysis

  • Li, Ming;Zhou, Xin;Zhao, Yang;Wang, Dao-Ping;Hu, Xiao-Na
    • Bulletin of the Korean Chemical Society
    • /
    • v.30 no.10
    • /
    • pp.2287-2293
    • /
    • 2009
  • Gas chromatography-mass spectrometry (GC-MS) fingerprint analysis, principal component analysis (PCA), and hierarchical cluster analysis (HCA) were introduced for the quality assessment of Curcuma longa L. (C. longa). The GC-MS fingerprint method was developed and validated by analyzing 33 batches of C. longa samples from different geographic locations. Eighteen chromatographic peaks were selected as characteristic peaks, and their relative peak areas (RPA) were calculated for quantitative expression. Two principal components (PCs) were extracted by PCA. C. longa collected from Guizhou and Fujian were separated from the other samples by PC1, which captured 71.83% of the variance, while PC2 contributed to their further separation, capturing 11.13% of the variance. HCA confirmed the result of the PCA. Therefore, a GC-MS fingerprint study combined with chemometric techniques provides a very flexible and reliable method for the quality assessment of C. longa.

Evaluation of Multivariate Stream Data Reduction Techniques (다변량 스트림 데이터 축소 기법 평가)

  • Jung, Hung-Jo;Seo, Sung-Bo;Cheol, Kyung-Joo;Park, Jeong-Seok;Ryu, Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.13D no.7 s.110
    • /
    • pp.889-900
    • /
    • 2006
  • Even though sensor networks differ in user requests and data characteristics depending on the application area, existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than considering the original characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features so that the reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques such as Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), together with experimental data sets such as multivariate time series, synthetic data, and robot execution failure data. The experimental results show that the SVD and Sampling methods are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, selecting the reduction method according to the data gives good performance on the experimental data sets. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications involving multivariate stream data.

Localized Positioning method for Optimal path Hierarchical clustering algorithm in Ad hoc network (에드 혹 네트워크에서 노드의 국부 위치 정보를 이용한 최적 계층적 클러스터링 경로 라우팅 알고리즘)

  • Oh, Young-Jun;Lee, Kang-Whan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.11
    • /
    • pp.2550-2556
    • /
    • 2012
  • We propose ALPS (Ad hoc network Localized Positioning System), an energy-efficient, range-free routing algorithm based on distance information. The routing coordinate method of ALPS consists of hierarchical cluster routing that immediately provides relative coordinate locations using RSSI (Received Signal Strength Indication) information. The conventional DV-hop algorithm is also managed on the basis of a normalized range-free method; simulation results show that the proposed hierarchical cluster routing algorithm achieves more optimized energy consumption and more sustainable path routing, which improves network management.

Classification of Daily Precipitation Patterns in South Korea using Mutivariate Statistical Methods

  • Mika, Janos;Kim, Baek-Jo;Park, Jong-Kil
    • Journal of Environmental Science International
    • /
    • v.15 no.12
    • /
    • pp.1125-1139
    • /
    • 2006
  • A cluster analysis of diurnal precipitation patterns is performed using daily precipitation from 59 stations in South Korea from 1973 to 1996, for the four seasons of each year. The four seasons are shifted forward by 15 days compared with the usual ones. The number of clusters is 15 in winter, 16 in spring and autumn, and 26 in summer. In each season, one of the classes is the totally dry day, on which no precipitation is observed at any station; this class is treated separately in this study. The distribution of days among the clusters is rather uneven, with rather low area-mean precipitation occurring most frequently. These 4 (seasons) $\times$ 2 (wet and dry days) classes represent more than half (59%) of all days of the year. On the other hand, even the smallest seasonal clusters have at least $5\sim9$ members in the 24-year (1973-1996) classification period. To account for spatial correlation, the cluster analysis is performed directly on the major $5\sim8$ non-correlated coefficients of the diurnal precipitation patterns obtained by factor analysis. More specifically, hierarchical clustering based on Euclidean distance and Ward's method of agglomeration is applied. The relative variance explained by the clustering is 63% on average, with better values in spring (66%) and winter (69%) and lower-than-average values in autumn (60%) and summer (59%). Applying weighted relative variances, i.e. dividing the squared deviations by the cluster averages, gives even better values (78% on average) compared with the same index without clustering. This means that the highest variance remains in the clusters with more precipitation. Besides all statistics necessary for validating the final classification, 4 cluster centers are mapped for each season to illustrate the range of typical extremes, paired according to their area-mean precipitation or negative pattern correlation. Possible alternatives to the performed classification and the reasons for their rejection are also discussed, along with a wide spectrum of recommended applications.
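Ward's method on Euclidean distance, as named in the abstract above, merges at each step the two clusters whose union least increases the within-cluster sum of squares. The sketch below is a naive pure-Python version on hypothetical 2-D "pattern coefficients"; the explained-variance function uses the common definition (one minus within-cluster over total sum of squares), which is assumed here rather than taken from the paper.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sum_of_squares(points):
    """Sum of squared Euclidean deviations from the centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_clustering(points, n_clusters):
    """Naive agglomeration: repeatedly merge the pair with the smallest Ward cost."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def explained_variance(points, clusters):
    """ASSUMED definition: 1 - (within-cluster SS) / (total SS)."""
    within = sum(sum_of_squares(c) for c in clusters)
    return 1.0 - within / sum_of_squares(points)

# Hypothetical factor coefficients for five days.
days = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0), (20.0, 0.0)]
clusters = ward_clustering(days, n_clusters=3)
share = explained_variance(days, clusters)
```

The naive pair search is cubic in the number of points, so this is only suitable for small illustrations; library implementations use update formulas to avoid recomputing every pair.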

Quality Assessment of Ijung-tang Preparations Using a HPLC Analysis (HPLC 분석법을 이용한 이중탕(理中湯) 제제의 품질평가)

  • Ha, Woo-Ram;Park, Jin-Hyung;Yun, Dong-In;Lee, Jang-Cheon;Kim, Jung-Hoon
    • The Korea Journal of Herbology
    • /
    • v.31 no.3
    • /
    • pp.29-35
    • /
    • 2016
  • Objectives: Ijung-tang (IJT) is a traditional herbal formula that has been used to treat digestive diseases such as abdominal pain, vomiting, and diarrhea. IJT consists of four herbal medicines, Ginseng radix, Atractylodis rhizoma alba, Zingiberis rhizoma, and Glycyrrhizae radix et rhizoma, which contain various bioactive compounds. Quality assessment of IJT preparations was performed with an analytical method for determining marker compounds. Methods: Seven marker compounds in IJT preparations were quantitatively determined by high-performance liquid chromatography equipped with a diode-array detector. The marker compounds were separated on a reversed-phase C18 column, and the analytical method was successfully validated. Chemometric analysis was performed to compare IJT water extracts and commercial IJT granules. Results: Limit of detection and limit of quantification values were in the ranges of 0.093-2.649 μg/mL and 0.283-8.027 μg/mL, respectively. Precisions were 0.30-3.87% within a day and 0.23-2.35% over three consecutive days. Recoveries of the marker compounds ranged from 87.35% to 107.05%, with relative standard deviation (RSD) values < 6.15%. Repeatabilities were < 1.20% and < 1.71% RSD for retention time and absolute peak area, respectively. The quantitative analysis showed that the quantities of the seven marker compounds varied among IJT samples, as was also found in principal component analysis and hierarchical clustering analysis. Conclusions: The analytical method developed in the present study was precise and reliable for the simultaneous determination of marker compounds of IJT. Therefore, it can be used for the quality assessment of IJT preparations.
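The precision and recovery figures above are reported as relative standard deviation (RSD) and percent recovery. A minimal sketch of both calculations follows, assuming the usual definitions (sample standard deviation divided by the mean, and the spiked amount recovered divided by the amount added); the replicate values are hypothetical and are not data from the study.

```python
import statistics

def rsd_percent(values):
    """Relative standard deviation: sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

def recovery_percent(found, original, spiked):
    """Percent of a spiked amount recovered: (found - original) / spiked * 100."""
    return (found - original) / spiked * 100

# Hypothetical replicate peak areas and a hypothetical spiking experiment (ug/mL).
peak_areas = [98.0, 100.0, 102.0]
precision = rsd_percent(peak_areas)
recovery = recovery_percent(found=14.5, original=10.0, spiked=5.0)
```

With these made-up numbers the RSD is 2% and the recovery is 90%, values of the same kind as those quoted in the abstract.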

Classification of Ambient Particulate Samples Using Cluster Analysis and Disjoint Principal Component Analysis (군집분석법과 분산주성분분석법을 이용한 대기분진시료의 분류)

  • 유상준;김동술
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.13 no.1
    • /
    • pp.51-63
    • /
    • 1997
  • Total suspended particulate matter in the ambient air was analyzed for eight chemical elements (Ca, Co, Cu, Fe, Mn, Pb, Si, and Zn) using X-ray fluorescence spectrometry (XRF) at the Kyung Hee University Suwon Campus from 1989 to 1994. To use these data as a basis for a source identification study, the membership of each sample was selected to represent one of the well-defined sample groups. The data sets, consisting of 83 objects and 8 variables, were initially separated into two groups, fine ($d_p < 3.3\ \mu\mathrm{m}$) and coarse ($d_p > 3.3\ \mu\mathrm{m}$) particle groups. A hierarchical clustering method was examined to obtain possible members of homogeneous sample classes for each of the two groups by transforming the raw data and by applying various distances. A disjoint principal component analysis was then used to define homogeneous sample classes after deleting outliers. Five homogeneous sample classes were determined for each of the fine and coarse particle groups. The data were properly classified by applying a logarithmic transformation and the Euclidean distance. After the homogeneous classes were determined, correlation coefficients among the eight chemical variables were calculated within all the homogeneous classes, and meteorological variables (temperature, relative humidity, wind speed, wind direction, and precipitation) were examined as well, in order to interpret in depth the environmental factors influencing the characteristics of each class in each group. According to our analysis, each class had its own distinct seasonal pattern, which was affected most sensitively by wind direction.
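The effect of combining a logarithmic transformation with the Euclidean distance can be seen in a small sketch. The base-10 logarithm, the function names, and the two-element concentration values are assumptions chosen for illustration; they are not data from the study.

```python
import math

def log_transform(sample):
    """Base-10 logarithm of each (strictly positive) concentration."""
    return tuple(math.log10(x) for x in sample)

def euclid(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical (Si, Pb) concentrations for three samples.
s1, s2, s3 = (1000.0, 0.1), (1100.0, 1.0), (2000.0, 0.1)

# Raw distances are dominated by the abundant element (Si).
raw_12, raw_13 = euclid(s1, s2), euclid(s1, s3)

# After the log transform, a tenfold change in the trace element (Pb) matters more.
t1, t2, t3 = (log_transform(s) for s in (s1, s2, s3))
log_12, log_13 = euclid(t1, t2), euclid(t1, t3)
```

On the raw scale sample 1 is nearest to sample 2, while on the log scale it is nearest to sample 3, so the choice of transformation can change which samples a hierarchical clustering groups together.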

Effect of Herbicide Combinations on Bt-Maize Rhizobacterial Diversity

  • Valverde, Jose R.;Marin, Silvia;Mellado, Rafael P.
    • Journal of Microbiology and Biotechnology
    • /
    • v.24 no.11
    • /
    • pp.1473-1483
    • /
    • 2014
  • Reports of herbicide resistance events are proliferating worldwide, leading to new cultivation strategies that use combinations of pre-emergence and post-emergence herbicides. Over a one-year cultivation cycle, we analyzed the impact of several herbicide combinations on the rhizobacterial community of glyphosate-tolerant Bt-maize and compared the resulting communities with those of untreated or glyphosate-treated soils. Samples were analyzed using pyrosequencing of the V6 hypervariable region of the 16S rRNA gene. The sequences obtained were subjected to taxonomic, taxonomy-independent, and phylogeny-based diversity studies, followed by a statistical analysis using principal components analysis and hierarchical clustering with jackknife statistical validation. The resilience of the microbial communities was analyzed by comparing their relative composition at the end of the cultivation cycle. The bacterial communities from soil subjected to a combined treatment with mesotrione plus s-metolachlor followed by glyphosate were not statistically different from those treated with glyphosate or from the untreated ones. The use of acetochlor plus terbuthylazine followed by glyphosate, and the use of aclonifen plus isoxaflutole followed by mesotrione, clearly affected the resilience of the corresponding bacterial communities. The treatment with pethoxamid followed by glyphosate had an intermediate effect. The use of glyphosate alone seems to be the least aggressive treatment for bacterial communities. Should a combined treatment be needed, the combination of mesotrione and s-metolachlor shows the next best final resilience. Our results show the relevance of comparative rhizobacterial community studies when novel combined herbicide treatments are deemed necessary to control weed growth.