• Title/Summary/Keyword: HITS


Improved Concept-base Search System Using HITS algorithm on Conceptual Graph (HITS알고리즘을 적용한 개념그래프 기반검색시스템의 성능개선)

  • 배환국;박호성;이상준;김기태
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.470-472 / 2003
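  • In this paper, we apply the HITS algorithm to improve the retrieval performance of a conceptual-graph-based search system. The existing system extracts concepts by analyzing anchor text. We go further and exploit the preference information carried by hyperlinks: in our experiment, the retrieved documents are ranked by importance according to how many hyperlinks point to them and how many documents they reference. Previously, concept results were presented in order of query-term frequency. After implementing this system and applying the ranking algorithm, pages that hold information useful for the query (authorities) and pages that hold links to such useful pages (hubs) are each shown in rank order. As a result, users see the most important documents first among the conceptually classified documents, can view highly important documents at a glance through the hubs, and can also retrieve documents with important information that does not appear in the anchor text.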

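The authority and hub rankings described in this abstract can be illustrated with a minimal sketch of the standard HITS iteration. This is not the authors' implementation: the function name `hits`, the toy link graph, and the fixed iteration count are assumptions made for illustration only.

```python
from math import sqrt

def hits(links, iterations=50):
    """Run a basic HITS iteration.

    links: dict mapping a page to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    Returns (authority, hub) score dicts, each L2-normalized.
    """
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking to each page.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub update: sum the authority scores of the pages each page links to.
        new_hub = {p: sum(new_auth[dst] for dst in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded across iterations.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}
    return auth, hub

# Toy graph (illustrative only): two pages link out, two pages are linked to.
example_links = {
    "portal": ["paper_a", "paper_b"],
    "blog": ["paper_a"],
    "paper_a": [],
    "paper_b": [],
}
auth, hub = hits(example_links)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

In this toy graph the page that receives links from both linking pages ends up with the highest authority score, and the page that links to both targets ends up with the highest hub score. A real system would build `links` from the hyperlinks among the retrieved documents.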

A Link control of the word associated relation with using HITS Algorithm (HITS 알고리즘을 이용한 단어 연관 관계 링크 제어)

  • Moon, Sung-Cheon;Lee, Jung-Hun;Cheon, Suh-H.
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.395-398 / 2010
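  • As more and more information becomes accessible through the Internet, presenting results that satisfy the user has become the ultimate goal of search engines. However, many studies, past and present, have shown that retrieving the desired results from a vast and varied body of information takes considerable time and effort. We improve the existing HITS algorithm and use link control to increase the relevance between pages.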


A SNS-based New Products Promotion Case Study (SNS 기반 신제품 프로모션 사례 연구)

  • Kim, Sung K.;Kim, Nam K.
    • Journal of Information Technology Applications and Management / v.20 no.4 / pp.263-278 / 2013
  • The number of SNS users has increased at a rapid rate. Many firms are expected to use SNS, especially in marketing. This study describes an SNS-based new product promotion case. We aim to identify how differently SNS users respond to different types of SNS media and SNS content. In this analysis, marketing effects are measured by the number of website hits. The results show how the number of user hits differs depending on the combination of SNS media and SNS content.

Enhanced Threshold Algorithm for HITS on the World Wide Web (World Wide Web을 위한 개선된 Threshold HITS 알고리즘)

  • 김혜민;김민구
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.106-108 / 2004
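  • HITS, a representative algorithm that exploits link structure, uses link information to compute authority and hub ratings. However, HITS has a known problem: a page that merely has many links, regardless of its importance, receives abnormally high authority and hub ratings. Several studies have tried to solve this problem. In this paper, to improve on the results of those studies, we apply the mean and a priority rather than the simple sum of the authority and hub ratings. An experiment measuring accuracy shows that the proposed algorithm outperforms the existing methods.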


Localization of 5,105 Hanwoo (Korean Cattle) BAC Clones on Bovine Chromosomes by the Analysis of BAC End Sequences (BESs) Involving 21,024 Clones

  • Choi, Jae Min;Chae, Sung-Hwa;Kang, Se Won;Choi, Dong-Sik;Lee, Yong Seok;Park, Hong-Seog;Yeo, Jung-Sou;Choi, Inho
    • Asian-Australasian Journal of Animal Sciences / v.20 no.11 / pp.1636-1650 / 2007
  • As an initial step toward a better understanding of the genome structure of Korean cattle (Hanwoo breed) and initiation of the framework for genomic research in this bovine, the bacterial artificial chromosome (BAC) end sequencing of 21,024 clones was recently completed. Among these clones, BAC End Sequences (BESs) of 20,158 clones with high quality sequences (Phred score ${\geq}20$, average BES equaled 620 bp and totaled 23,585,814 bp), after editing sequencing results by eliminating vector sequences, were used initially to compare sequence homology with the known bovine chromosomal DNA sequence by using BLASTN analysis. Blast analysis of the BESs against the NCBI Genome database for Bos taurus (Build 2.1) indicated that the BESs from 13,201 clones matched bovine contig sequences with significant blast hits (E<$e^{-40}$), including 7,075 single-end hits and 6,126 paired-end hits. Finally, a total of 5,105 clones of the Korean cattle BAC clones with paired-end hits, including 4,053 clones from the primary analysis and 1,052 clones from the secondary analysis, were mapped to the bovine chromosome with very high accuracy.

Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

  • Kwon, Ji-Sun;Kim, Ji-Hye;Nam, Doug-U;Kim, Sang-Soo
    • Genomics & Informatics / v.10 no.2 / pp.123-127 / 2012
  • Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. On the other hand, GSA-SNP reported 94 and 38 hits for the same two datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets as well. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact due to the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be due to the fact that it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended.
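The sensitivity of a Z-statistic to the background level of associations can be seen in a small sketch. This is a generic gene-set Z-score, not GSA-SNP's actual implementation: the function `gene_set_z`, the formula's exact form, and all score values are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def gene_set_z(set_scores, background_scores):
    """Generic Z-statistic for a gene set (illustrative assumption).

    Compares the mean score of the genes in the set with the mean of all
    gene scores, scaled by the background standard deviation and set size.
    """
    mu = mean(background_scores)
    sigma = pstdev(background_scores)
    return (mean(set_scores) - mu) / (sigma / sqrt(len(set_scores)))

# Hypothetical per-gene association scores (e.g. -log10 p-values).
gene_set = [3.0, 3.5, 4.0, 2.5]
background_low = [1.0, 1.5, 2.0, 0.5, 1.0, 3.0, 3.5, 4.0, 2.5]
# Same gene set, but the genes outside the set now score one unit higher.
background_high = [s + 1.0 for s in background_low[:5]] + background_low[5:]

z_low = gene_set_z(gene_set, background_low)
z_high = gene_set_z(gene_set, background_high)
```

With the gene set's own scores unchanged, raising the background scores lowers the Z-statistic, which is one way a denser imputed dataset could yield fewer hits under a fixed threshold.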

Predicting Damage in a Concrete Structure Using Acoustic Emission and Electrical Resistivity for a Low and Intermediate Level Nuclear Waste Repository

  • Hong, Chang-Ho;Kim, Jin-Seop;Lee, Hang-Lo;Cho, Dong-Keun
    • Journal of Nuclear Fuel Cycle and Waste Technology(JNFCWT) / v.19 no.2 / pp.197-204 / 2021
  • In this study, the well-known non-destructive acoustic emission (AE) and electrical resistivity methods were employed to predict quantitative damage in the silo structure of the Wolsong Low and Intermediate Level Radioactive Waste Disposal Center (WLDC), Gyeongju, South Korea. A Brazilian tensile test was conducted on a fully saturated specimen with a composition identical to that of the WLDC silo concrete. Bi-axial strain gauges, AE sensors, and electrodes were attached to the surface of the specimen to monitor changes. Both the AE hit and electrical resistance values helped in anticipating imminent specimen failure, which was further confirmed using a strain gauge. The quantitative damage (or damage variable) was defined according to the AE hits and electrical resistance and analyzed with stress ratio variations. Approximately 75% of the damage occurred when the stress ratio exceeded 0.5. Quantitative damage from AE hits and electrical resistance showed a good correlation (R = 0.988, RMSE = 0.044). This implies that AE and electrical resistivity can be used complementarily for damage assessment of the structure. In the future, damage to dry and heated specimens will be examined using AE hits and electrical resistance, and the results will be compared with those from this study.
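The comparison of two damage estimates by correlation and RMSE can be sketched as follows. The abstract does not give the definition of the damage variable, so this sketch assumes one common choice, cumulative AE hits divided by total hits at failure. All numbers are hypothetical and are not the study's data.

```python
from itertools import accumulate
from math import sqrt

def damage_from_hits(hits_per_step):
    """Assumed damage variable: cumulative AE hits / total hits at failure."""
    total = sum(hits_per_step)
    return [c / total for c in accumulate(hits_per_step)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def rmse(x, y):
    """Root-mean-square error between two equal-length series."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical AE hit counts per load step; hits accelerate near failure.
ae_hits = [2, 3, 5, 15, 25, 50]
damage_ae = damage_from_hits(ae_hits)
# Hypothetical damage values derived from electrical resistance.
damage_er = [0.03, 0.06, 0.12, 0.24, 0.52, 1.0]

r = pearson_r(damage_ae, damage_er)
err = rmse(damage_ae, damage_er)
```

A correlation near 1 together with a small RMSE indicates that the two methods track the same damage evolution, which is the sense in which the abstract calls them complementary.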

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia pacific journal of information systems / v.21 no.2 / pp.89-116 / 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for their future use or sharing purposes. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of folksonomy users is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher rankings to more useful resources or experts. What resources are considered useful in a folksonomic system? Does a standard superior to frequency or freshness exist? A resource recommended by more users with more expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are PageRank by Google and HITS (Hypertext Induced Topic Selection) by Kleinberg. Both PageRank and HITS assign a higher evaluation score to pages linked to a larger number of higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these algorithms are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous (i.e., users, resources, and tags). Therefore, uniform application of the voting notion of PageRank and HITS based on the links to a folksonomy would be unreasonable. In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is in the active or the passive voice. The current research stems from the idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual interactions between entities, rather than the voting notion of PageRank or HITS.
The concept of mutual interactions, proposed for ranking Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tag to the same resource, and there can be more than one tag with the same subjectivity and objectivity. When many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from studies in psychology in which expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Considering that the popularity of a topic in social media is temporary, recent data should have more weight than old data.
We propose a comprehensive folksonomy ranking framework in which all these considerations are dealt with and that can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with those of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to the time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground in the situation where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Fourth, there is a big difference in the calculation time and memory use between the two kinds of algorithms.
While the multiplication of two matrices has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed with the ranking factors combined, and our approach can work even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.
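The idea that recent data should weigh more than old data can be illustrated with a simple decay function. The abstract does not specify the form of the paper's time weight, so the exponential half-life decay below, the function names, and the sample events are all assumptions made for illustration.

```python
def time_weight(age_days, half_life_days=30.0):
    """Hypothetical time weight: halves every `half_life_days` days."""
    return 0.5 ** (age_days / half_life_days)

def weighted_score(events, half_life_days=30.0):
    """Sum property weights discounted by age.

    events: list of (property_weight, age_days) pairs, e.g. one pair per
    tagging or bookmarking action on a resource (assumed input format).
    """
    return sum(w * time_weight(age, half_life_days) for w, age in events)

# Five old bookmarks versus three recent ones, all with property weight 1.0.
old_resource = [(1.0, 300)] * 5
new_resource = [(1.0, 3)] * 3

old_score = weighted_score(old_resource)
new_score = weighted_score(new_resource)
```

Under this assumed decay, the resource with fewer but more recent actions scores higher than the one with more but older actions, which matches the behavior the abstract describes for time-valued ranking. The half-life would be a tunable policy parameter for each site.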