• Title/Summary/Keyword: Average Linkage Method

Hierarchic Document Clustering in OPAC (OPAC에서 자동분류 열람을 위한 계층 클러스터링 연구)

  • 노정순
    • Journal of the Korean Society for Information Management / v.21 no.1 / pp.93-117 / 2004
  • This study develops a hierarchic clustering model for document classification and browsing in OPAC systems. Two automatic indexing techniques (with and without controlled terms), two term weighting methods (term frequency and binary weight), five similarity coefficients (Dice, Jaccard, Pearson, Cosine, and Squared Euclidean), and three hierarchic clustering algorithms (Between Average Linkage, Within Average Linkage, and Complete Linkage) were tested on a collection of 175 books and theses on library and information science. The best document clusters resulted from the Between Average Linkage or Complete Linkage method with the Jaccard or Dice coefficient, applied to binary vectors from automatic indexing with controlled terms. The clusters from Between Average Linkage with Jaccard more closely resembled a decimal classification structure.
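
The combination this abstract reports as working best (binary term vectors, Jaccard or Dice similarity, between-group average linkage) can be sketched in plain Python. This is a minimal illustration, not the study's implementation: the function names, the representation of documents as term sets, and the toy documents are all assumptions made here.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient of two term sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def dice(a, b):
    """Dice coefficient of two term sets: 2|A & B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0

def between_average(c1, c2, sim):
    """Mean similarity over pairs with one document in each cluster."""
    return sum(sim(x, y) for x in c1 for y in c2) / (len(c1) * len(c2))

def within_average(c1, c2, sim):
    """Mean similarity over all pairs inside the would-be merged cluster."""
    pairs = list(combinations(c1 + c2, 2))
    return sum(sim(x, y) for x, y in pairs) / len(pairs)

def agglomerate(docs, sim, linkage, n_clusters):
    """Merge the most similar pair of clusters until n_clusters remain."""
    clusters = [[d] for d in docs]
    while len(clusters) > n_clusters:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]], sim),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical documents, each reduced to a set of index terms (binary weights).
d1 = frozenset({"clustering", "indexing", "opac"})
d2 = frozenset({"clustering", "indexing", "retrieval"})
d3 = frozenset({"cataloging", "marc"})
d4 = frozenset({"cataloging", "marc", "authority"})

groups = agglomerate([d1, d2, d3, d4], jaccard, between_average, 2)
```

Swapping `between_average` for `within_average`, or `jaccard` for `dice`, changes only the arguments to `agglomerate`, which is roughly how the experimental conditions in the abstract differ from one another.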

Document Clustering Using Reference Titles (인용문헌 표제를 이용한 문헌 클러스터링에 관한 연구)

  • Choi, Sang-Hee
    • Journal of the Korean Society for Information Management / v.27 no.2 / pp.241-252 / 2010
  • Titles have been regarded as having effective clustering features, but they sometimes fail to represent the topic of a document and result in poorly generated document clusters. This study aims to improve the performance of title-based document clustering by proposing titles in the citation bibliography as a clustering feature. Titles of the original literature, titles in the citation bibliography, and an aggregation of both were used to measure clustering performance. Each feature was combined with three hierarchical clustering methods (within-group average linkage, complete linkage, and Ward's method) in the clustering experiment. The best case in this experiment was clustering documents with features from both sets of titles using the within-group average linkage method.

Detection of QTL on Bovine X Chromosome by Exploiting Linkage Disequilibrium

  • Kim, Jong-Joo
    • Asian-Australasian Journal of Animal Sciences / v.21 no.5 / pp.617-623 / 2008
  • A fine-mapping method exploiting linkage disequilibrium was used to detect quantitative trait loci (QTL) on the X chromosome affecting milk production, body conformation and productivity traits. The pedigree comprised 22 paternal half-sib families of Black-and-White Holstein bulls in the Netherlands in a grand-daughter design for a total of 955 sons. Twenty-five microsatellite markers were genotyped to construct a linkage map of the X chromosome spanning 170 Haldane cM with an average inter-marker distance of 7.1 cM. A covariance matrix of identical-by-descent probabilities between haplotypes, describing QTL allele effects, was incorporated into the animal model, and a restricted maximum-likelihood method was applied to test for the presence of QTL using the LDVCM program. Significance thresholds were obtained by permuting haplotypes to phenotypes and by using a false discovery rate procedure. Seven QTL responsible for conformation types (teat length, rump width, rear leg set, angularity and fore udder attachment), behavior (temperament) and a mixture of production and health (durable prestation) were detected at the suggestive level. Some QTL affecting teat length, rump width, durable prestation and rear leg set had small numbers of haplotype clusters, which may indicate good classification of alleles for causal genes or markers that are tightly associated with the causal mutation. However, higher marker density is required to better refine the QTL position and to better characterize functionally distinct haplotypes, which will provide information to find causal genes for the traits.
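
The map figures quoted above can be checked with a few lines of Python. The sketch assumes the average spacing is the map length divided by the number of intervals between 25 markers, and uses the standard Haldane map function to turn a distance into a recombination fraction; the function names are illustrative and not taken from the paper or the LDVCM program.

```python
import math

def average_spacing(map_length_cm, n_markers):
    """Average distance between adjacent markers (n_markers - 1 intervals)."""
    return map_length_cm / (n_markers - 1)

def haldane_recombination(distance_cm):
    """Haldane map function: r = 0.5 * (1 - exp(-2d)), with d in Morgans."""
    d = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))

spacing = average_spacing(170.0, 25)   # about 7.08 cM, reported as 7.1 cM
r = haldane_recombination(spacing)     # recombination fraction between neighbors
```

Under these assumptions, adjacent markers recombine in roughly 6 to 7 percent of meioses, which gives a sense of why the abstract calls for higher marker density to refine QTL positions.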

A genome-wide association study (GWAS) for pH value in the meat of Berkshire pigs

  • Park, Jun;Lee, Sang-Min;Park, Ja-Yeon;Na, Chong-Sam
    • Journal of Animal Science and Technology / v.63 no.1 / pp.25-35 / 2021
  • The purpose of this study is to estimate the single nucleotide polymorphism (SNP) effects for pH values affecting Berkshire meat quality. A total of 39,603 SNPs from 1,978 head after quality control and 882 pH values were used to estimate SNP effects by the single-step genomic best linear unbiased prediction (ssGBLUP) method. The average physical distance between adjacent SNP pairs was 61.7 kbp, and the number and proportion of SNPs whose minor allele frequency was below 10% were 9,573 and 24.2%, respectively. The averages of observed heterozygosity and polymorphic information content were 0.32 ± 0.16 and 0.26 ± 0.11, respectively, and the estimate of average linkage disequilibrium was 0.40. The heritabilities of pH45m and pH24h were 0.10 and 0.15, respectively. SNPs with an absolute value more than 4 standard deviations from the mean were selected as threshold markers; among the selected SNPs, protein-coding genes for pH45m and pH24h were detected in 6 and 4 SNPs, respectively. The distribution of coding genes were detected at pH45m and were detected at pH24h.
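
Two of the numerical steps in this abstract are simple enough to sketch. The first check reproduces the reported low-MAF proportion from the two counts given. The second shows one reading of the marker selection rule, namely keeping SNPs whose estimated effect lies more than 4 standard deviations from the mean effect; the abstract does not spell out the exact rule, so `select_outlier_snps`, the use of the population standard deviation, and the toy effect values are assumptions.

```python
from statistics import mean, pstdev

def low_maf_percentage(n_low_maf, n_total):
    """Percentage of SNPs whose minor allele frequency is below the cutoff."""
    return 100.0 * n_low_maf / n_total

def select_outlier_snps(effects, n_sd=4.0):
    """Return SNP ids whose effect is more than n_sd SDs from the mean effect.

    `effects` maps a SNP id to its estimated effect (assumed interpretation).
    """
    mu = mean(effects.values())
    sd = pstdev(effects.values())
    return [snp for snp, e in effects.items() if abs(e - mu) > n_sd * sd]

pct = low_maf_percentage(9573, 39603)  # counts taken from the abstract

# Hypothetical effects: 100 small values plus one clearly extreme SNP.
toy_effects = {f"snp{i}": 0.01 * ((i % 5) - 2) for i in range(100)}
toy_effects["snp_big"] = 1.0
selected = select_outlier_snps(toy_effects)
```

A fixed multiple of the standard deviation is a simple screening rule; it does not by itself control for multiple testing across tens of thousands of SNPs.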

Application of Multivariate Statistics for Characterization of Sensory Properties in Pre-cooked Foods (다변수 통계법을 이용한 조리식품의 관능특성 연구)

  • Yoon, Hee-Nam
    • Korean Journal of Food Science and Technology / v.23 no.6 / pp.711-716 / 1991
  • Various multivariate statistics were applied to determine the relationships between sensory properties of 9 pre-cooked foods. Twelve sensory terms were selected to differentiate the food samples in stepwise discriminant analysis. Three factors accounted for 61.9% of total variation of 12 sensory attributes detected. Factor I was highly related to the qualitative sensory terms, while factor II to the quantitative ones. The principal component plot made it possible to define the relationships between sensory properties and food samples. In cluster analysis using average linkage and Ward's method, nine pre-cooked foods were classified into three clusters in terms of their sensorial similarities.

Evaluation of a New Fine-mapping Method Exploiting Linkage Disequilibrium: a Case Study Analysing a QTL with Major Effect on Milk Composition on Bovine Chromosome 14

  • Kim, JongJoo;Georges, Michel
    • Asian-Australasian Journal of Animal Sciences / v.15 no.9 / pp.1250-1256 / 2002
  • A novel fine-mapping method exploiting linkage disequilibrium (LD) was applied to better refine the quantitative trait loci (QTL) positions for milk production traits on bovine chromosome 14 in the pedigree comprising 22 paternal half-sib families of a Black-and-White Holstein-Friesian grand-daughter design in the Netherlands for a total of 1,034 sons. The chromosome map was constructed with the 31 genetic markers spanning 90 Kosambi cM with the average inter-marker distance of 3.5 cM. The linkage analyses, in which the effects of sire QTL alleles were assumed random and the random factor of the QTL allelic effects was incorporated into the Animal Model, found the QTL for milk, fat, and protein yield and fat and protein % with the Lod scores of 10.9, 2.3, 6.0, 25.4 and 3.2, respectively. The joint analyses including LD information by use of multi-marker haplotypes highly increased the evidence of the QTL (Lod scores were 25.1, 20.9, 11.0, 85.7 and 17.4 for the corresponding traits, respectively). The joint analyses including DGAT markers in the defined haplotypes again increased the QTL evidence and the most likely QTL positions for the five traits coincided with the position of the DGAT gene, supporting the hypothesis of the direct causal involvement of the DGAT gene. This study strongly indicates that the exploitation of LD information will allow additional gains of power and precision in finding and localising QTL of interest in livestock species, on the condition of high marker density around the QTL region.

A Linkage Method for the Life Cycle Cost Breakdown Structure through an Analysis of Boundary Conditions (경계조건 분석을 통한 LCCBS 연계방안)

  • Jeong, Jae-Hyuk;Kim, Tae-Hui
    • Journal of the Korea Institute of Building Construction / v.13 no.4 / pp.321-332 / 2013
  • Costs and expenses are intertwined and incurred throughout an entire construction project, even from the pre-construction phase, and each phase has a different impact on the life cycle cost (LCC). However, the cost breakdown structure (CBS) is different in each phase of a building construction project, which makes it hard to calculate construction cost reasonably. For this reason, the boundary conditions were analyzed in this study based on the life cycle cost breakdown structure (LCCBS). In addition, breakdown factors were analyzed based on the boundary conditions to derive a linkage method. The validity of the linkage method was verified through application to actual construction projects. The analysis found that the problem of items being left out was reduced by more than 97.2 percent, and that the work was completed an average of 6 hours faster than with the conventional method. By applying the new LCC system, LCC is expected to be both reduced and calculated more efficiently.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed to group observations into clusters, and a variety of measures aimed at validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means and model-based clustering, and compute validity measures such as connectivity, the Dunn index and silhouette on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull prepared by the BBQ cooking method.
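
Two of the validity measures named in this abstract, the silhouette and the Dunn index, have short textbook definitions. The sketch below follows those definitions with Euclidean distance in plain Python; it is not the paper's code, the function names are chosen here, and it assumes at least two clusters and Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations

def _group(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters

def silhouette(points, labels):
    """Mean silhouette width: (b - a) / max(a, b) averaged over all points."""
    clusters = _group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = _group(points, labels)
    diameter = max(
        (math.dist(p, q) for c in clusters.values() for p, q in combinations(c, 2)),
        default=0.0,
    )
    separation = min(
        math.dist(p, q)
        for (_, ca), (_, cb) in combinations(clusters.items(), 2)
        for p in ca
        for q in cb
    )
    return separation / diameter if diameter > 0 else math.inf

# Hypothetical data: two tight, well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labs = [0, 0, 1, 1]
s = silhouette(pts, labs)
d = dunn_index(pts, labs)
```

Both measures reward compact, well-separated clusters, so higher values indicate a better partition; comparing them across candidate numbers of clusters is one common way to choose that number.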

Motion Vector Recovery Scheme for H.264/AVC (H.264/AVC을 위한 움직임 벡터 복원 방법)

  • Son, Nam-Rye
    • The Journal of the Korea Contents Association / v.8 no.5 / pp.29-37 / 2008
  • To transmit a video bit stream over a low-bandwidth link such as a wireless channel, a high-compression algorithm such as the H.264 codec is used. When a highly compressed video bit stream is transmitted over low bandwidth, packet loss causes severe degradation in image quality. In this paper, a new algorithm for recovering missing or erroneous motion vectors is proposed. Because the missing or erroneous motion vectors of a block are closely correlated with those of neighboring blocks, the motion vectors of neighboring blocks are clustered with the average linkage algorithm, and a representative value for each cluster is determined to obtain the candidate motion vector set. Simulation results show that the proposed method dramatically improves processing time compared to the existing H.264/AVC method while achieving similar visual quality.
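
The clustering step described here can be illustrated as follows. The abstract does not give the distance measure, the stopping rule, or how the representative value is chosen, so this sketch assumes Euclidean distance, a hypothetical distance `threshold` for stopping, and the cluster mean as the representative; none of these choices should be read as the paper's.

```python
import math
from itertools import combinations

def average_linkage_distance(c1, c2):
    """Mean Euclidean distance over all cross-cluster pairs of motion vectors."""
    return sum(math.dist(u, v) for u in c1 for v in c2) / (len(c1) * len(c2))

def cluster_motion_vectors(vectors, threshold):
    """Agglomerate neighbor motion vectors until the closest pair of clusters
    is farther apart than `threshold` (assumed stopping rule)."""
    clusters = [[v] for v in vectors]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage_distance(clusters[p[0]], clusters[p[1]]),
        )
        if average_linkage_distance(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def representatives(clusters):
    """One candidate motion vector per cluster: the component-wise mean."""
    return [
        (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
        for c in clusters
    ]

# Hypothetical motion vectors (dx, dy) of the blocks around a lost block.
neighbors = [(1, 1), (2, 1), (1, 2), (-8, 0), (-9, 1)]
candidates = representatives(cluster_motion_vectors(neighbors, threshold=3.0))
```

Reducing the neighbors to a few representatives shrinks the candidate set a decoder has to evaluate, which is consistent with the processing-time improvement the abstract reports.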

An Empirical Comparison and Verification Study on the Containerports Clustering Measurement Using K-Means and Hierarchical Clustering(Average Linkage Method Using Cross-Efficiency Metrics, and Ward Method) and Mixed Models (K-Means 군집모형과 계층적 군집(교차효율성 메트릭스에 의한 평균연결법, Ward법)모형 및 혼합모형을 이용한 컨테이너항만의 클러스터링 측정에 대한 실증적 비교 및 검증에 관한 연구)

  • Park, Ro-Kyung
    • Journal of Korea Port Economic Association / v.34 no.3 / pp.17-52 / 2018
  • The purpose of this paper is to measure the clustering change and analyze empirical results. Additionally, by using k-means, hierarchical, and mixed models on Asian container ports over the period 2006-2015, the study aims to form a cluster comprising Busan, Incheon, and Gwangyang ports. The models consider the number of cranes, depth, berth length, and total area as inputs and container twenty-foot equivalent units (TEU) as output. The main empirical results are as follows. First, ranking order according to the increasing ratio during the 10-year analysis shows that the values for average linkage (AL), mixed Ward, rule of thumb (RT) & elbow, Ward, and mixed AL are 42.04% up, 35.01% up, 30.47% up, and 23.65% up, respectively. Second, according to the RT and elbow models, the three Korean ports can be clustered with Asian ports in the following manner: Busan Port (Hong Kong, Guangzhou, Qingdao, and Singapore), Incheon Port (Tokyo, Nagoya, Osaka, Manila, and Bangkok), and Gwangyang Port (Guangzhou, Ningbo, Qingdao, and Kasiung). Third, optimal clustering numbers are as follows: AL (6), Mixed Ward (5), RT & elbow (4), Ward (5), and Mixed AL (6). Fourth, empirical clustering results match those of the questionnaire as follows: Busan Port (80%), Incheon Port (17%), and Gwangyang Port (50%). The policy implication is that related parties of Korean seaports should introduce port improvement plans such as benchmarking the clustered seaports.