• Title/Summary/Keyword: ranking-based search

Concept Network-based Personalized Web Search Systems (개념 네트워크 기반 사용자 인지형 웹 검색 시스템)

  • Yune, Hong-June;Noh, Joon-Ho;Kim, Han-Joon;Lee, Byung-Jeong;Kang, Soo-Yong;Chang, Jae-Young
    • Journal of Internet Computing and Services / v.12 no.2 / pp.63-73 / 2011
  • In general, conventional search engines return the same results for the same query regardless of who submits it; such techniques do not consider users' characteristics. To overcome this problem, we need a personalized search method that returns customized results according to users' preferences. In this paper, we propose a concept network profile-based personalized web search system in which a concept network is built to accumulate users' characteristics. The concept network-based user profile is used to expand initial search queries to achieve personalized search. The concept network is a network of concepts in which a concept is generated each time a query is submitted; each concept is defined as a set of keywords extracted from the selected documents. Furthermore, we have improved the concept network by augmenting the intent keywords of each concept with the set of classification tags (a folksonomy) assigned to each document. As an additional personalization technique, we propose a new re-ranking method that analyzes the degree of overlap among search results.
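
The query-expansion idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' system: the class name `ConceptProfile`, its methods, whitespace tokenization, and the choice of the most frequent keywords as expansion terms are all hypothetical.

```python
from collections import Counter


class ConceptProfile:
    """Hypothetical user profile: one keyword set (concept) per submitted query."""

    def __init__(self):
        # Maps a query string to a Counter of keywords seen in selected documents.
        self.concepts = {}

    def record(self, query, selected_docs):
        """Accumulate keywords from the documents the user selected for a query."""
        counts = self.concepts.setdefault(query, Counter())
        for doc in selected_docs:
            counts.update(doc.lower().split())

    def expand(self, query, k=2):
        """Return the query terms plus up to k of the most frequent profile keywords."""
        query_terms = query.lower().split()
        counts = self.concepts.get(query, Counter())
        extra = [w for w, _ in counts.most_common() if w not in query_terms][:k]
        return query_terms + extra


profile = ConceptProfile()
profile.record("jaguar", ["jaguar car dealer", "jaguar car review"])
expanded = profile.expand("jaguar", k=1)
```

Here a user who repeatedly selects car-related documents for the query "jaguar" gets the query expanded with "car"; a real system would use proper keyword extraction and link related concepts into a network.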

A Case Study on the Next Generation Library Catalogs (차세대 도서관 목록 사례의 고찰)

  • Yoon, Cheong-Ok
    • Journal of Korean Library and Information Science Society / v.41 no.1 / pp.5-28 / 2010
  • The purpose of this study is to investigate the major features of Next Generation Library Catalogs. 'Next Generation Melvyl Pilot' of the University of California Library System and 'SearchWorks' of Stanford University Library are examined. The former is built on OCLC WorldCat Local, while the latter is based on Blacklight, an open-source catalog software. Both provide common features, including enriched content, faceted navigation, keyword searching, relevancy ranking of search results, and user contribution, but some functions vary in scope and content. Both also appear to be still under development rather than complete implementations.

Semantic Web based Information Retrieval System for the automatic integration framework (자동화된 통합 프레임워크를 위한 시맨틱 웹 기반의 정보 검색 시스템)

  • Choi Ok-Kyung;Han Sang-Yong
    • The KIPS Transactions:PartC / v.13C no.1 s.104 / pp.129-136 / 2006
  • An information retrieval system aims to provide fast and accurate information to users. However, current search systems are based on plain syntactic analysis, which makes it difficult for the user to find exactly the required information. This paper proposes SW-IRS (Semantic Web-based Information Retrieval System), which uses an Ontology Server. The proposed system is designed to maximize the efficiency and accuracy of information retrieval over unstructured and semi-structured documents by using agent-based automatic classification technology and Semantic Web-based information retrieval methods. For interoperability and easy integration, an RDF-based repository system is supported, and a newly developed ranking algorithm is applied to rank search results and provide more accurate and reliable information. Finally, the new ranking algorithm is used to evaluate performance and verify the efficiency and accuracy of the proposed retrieval system.

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach (시맨틱 웹 자원의 랭킹을 위한 알고리즘: 클래스중심 접근방법)

  • Rho, Sang-Kyu;Park, Hyun-Jung;Park, Jin-Soo
    • Asia pacific journal of information systems / v.17 no.4 / pp.31-59 / 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. To solve this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available in the future, ranking search results effectively and efficiently will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated based on the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if it is referred to by many other pages. The degree of importance also increases if the importance of the referring pages is high. Kleinberg's algorithm is another link-structure-based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm utilizes two kinds of scores: the authority score and the hub score. If a page has a high authority score, it is an authority on a given topic and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, the link-structure-based ranking method has been playing an essential role in the World Wide Web (WWW), and nowadays many people recognize its effectiveness and efficiency. On the other hand, as the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, making ranking algorithms for RDF knowledge bases greatly important. The RDF graph consists of nodes and directional links, similar to the Web graph. As a result, the link-structure-based ranking method seems to be highly applicable to ranking Semantic Web resources.
However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be considered as one huge class, i.e., a collection of Web pages, which has only a recursive property, i.e., a 'refers to' property corresponding to the hyperlinks. The Semantic Web, in contrast, encompasses various kinds of classes and properties; consequently, ranking methods used in the WWW should be modified to reflect the complexity of the information space in the Semantic Web. Previous research addressed the ranking problem of query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm to rank Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to the authority score and the hub score of Kleinberg's algorithm, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of a resource on another resource depending on the characteristic of the property linking the two resources. A node with a high objectivity score becomes the object of many RDF triples, and a node with a high subjectivity score becomes the subject of many RDF triples. They developed several kinds of Semantic Web systems to validate their technique and showed some experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, there remained some limitations, which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, for their algorithm to work properly, the ratio of links to nodes should be high, or the resources overall should be described in detail to a certain degree.
Second, the Tightly-Knit Community (TKC) effect, the phenomenon in which pages that are less important but densely connected receive higher scores than pages that are more important but sparsely connected, remains problematic. Third, a resource may have a high score not because it is actually important, but simply because it is very common and consequently has many links pointing to it. In this paper, we examine such ranking problems from a novel perspective and propose a new algorithm that can solve the problems left open by the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted by the previous research, under our approach a user determines the weight of a property by comparing its relative significance to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are intended to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. This approach closely reflects the way people evaluate things in the real world, and will turn out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect and can also shed light on other limitations posed by the previous research. In addition, we propose two ways to incorporate data-type properties, which have not been employed even when they have some significance for resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of the ranking results, which had never been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure.
Finally, we summarize our experimental results and discuss further research issues.
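
The authority and hub scores of Kleinberg's algorithm, as described in the abstract above, can be illustrated with a short sketch. It shows the classic iteration on a toy link graph and is not the class-oriented algorithm the paper proposes; the function name `hits`, the example graph, and the fixed iteration count are assumptions made for illustration.

```python
import math


def hits(edges, iterations=50):
    """Classic authority/hub iteration; edges is a list of (source, target) links."""
    nodes = sorted({n for edge in edges for n in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of nodes linking to this node.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub score: sum of the authority scores of nodes this node links to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        # Normalize both vectors to unit length so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in new_auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}
    return auth, hub


# Toy graph: "c" is referred to by three nodes; "a" links to both "b" and "c".
edges = [("a", "c"), ("b", "c"), ("d", "c"), ("a", "b")]
auth, hub = hits(edges)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

In this toy graph the node referred to by the most others ends up as the top authority, and the node linking to the most authoritative nodes ends up as the top hub. Applying the same idea to RDF graphs would additionally need the property weights discussed in the abstract.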

Development of Genetic Algorithms for Efficient Constraints Handling (구속조건의 효율적인 처리를 위한 유전자 알고리즘의 개발)

  • Cho, Young-Suk;Choi, Dong-Hoon
    • Proceedings of the KSME Conference / 2000.04a / pp.725-730 / 2000
  • Genetic algorithms, based on the theory of natural selection, have been applied to many different fields and have proven to be a relatively robust means of searching for a global optimum and handling discontinuous or even discrete data. Genetic algorithms are widely used for unconstrained optimization problems. However, their application to constrained optimization problems remains unsettled. The most prevalent technique for coping with infeasible solutions is to penalize a population member for constraint violation. However, the penalty weight for a particular problem constraint is usually determined heuristically. Therefore, this paper proposes an effective technique for handling constraints: the ranking penalty method and hybrid genetic algorithms. This paper also proposes a dynamic mutation rate to maintain diversity in the population. The effectiveness of the proposed algorithm is tested on several test problems, and the results are discussed.
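
The conventional penalty approach mentioned in the abstract above can be sketched as follows. This shows a plain static penalty with a hand-picked weight, the heuristic the abstract criticizes; it is not the ranking penalty method the paper proposes, whose details the abstract does not give. The function name, the constraint convention `g(x) <= 0`, and the weight value are assumptions.

```python
def penalized_fitness(objective, constraints, x, weight=1000.0):
    """Fitness for minimization with a static penalty.

    Each constraint g is assumed to be satisfied when g(x) <= 0; any positive
    value counts as a violation and is squared, summed, and scaled by weight.
    """
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return objective(x) + weight * violation


def objective(x):
    # Unconstrained minimum at x = 3.
    return (x - 3.0) ** 2


def at_most_two(x):
    # Constraint x <= 2, written in the g(x) <= 0 form.
    return x - 2.0


feasible_score = penalized_fitness(objective, [at_most_two], 2.0)
infeasible_score = penalized_fitness(objective, [at_most_two], 3.0)
```

The infeasible point, which would win on the raw objective, is pushed far behind the feasible one. How far depends entirely on `weight`, which illustrates why choosing it by hand is problem-dependent.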

Research Productivity in Business and Economics: South Korea, 1990-2016

  • Jin, Jang C.
    • East Asian Economic Review / v.23 no.1 / pp.89-107 / 2019
  • This paper ranks higher education institutions in Korea based upon research productivity in the business and economics disciplines. The number of SCI-level journal articles is tabulated using the Web of Science search engine over the sample period from 1990 to 2016. The league table shows that many private universities dominate the top-tier ranks, which is consistent with the school reputations most commonly cited by the general public in Korea. In contrast, many national universities appear in the second tier, and their weak performance in business and economics contrasts sharply with our earlier findings, in which national universities performed well in science and engineering fields (Jin and Kim, 2018). In addition, the ranking order among lower-ranked schools is found to be sensitive to a small change in publications, whereas the publication gap among top-tier schools is relatively large. Finally, contrary to general perception, the size of a school does not matter for collaborative research. Some policy implications are discussed in the conclusion.

An Efficient Keyword Search Method on RDF Data (RDF 데이타에 대한 효율적인 검색 기법)

  • Kim, Jin-Ha;Song, In-Chul;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.35 no.6 / pp.495-504 / 2008
  • Recently, there has been much work on supporting keyword search not only for text documents, but also for structured data such as relational data, XML data, and RDF data. In this paper, we propose an efficient keyword search method for RDF data. The proposed method first groups related nodes and edges in RDF data graphs to reduce data sizes for efficient keyword search and to allow relevant information to be returned together in the query answers. The proposed method also utilizes the semantics in RDF data to measure the relevancy of nodes and edges with respect to keywords for search result ranking. Experimental results based on real RDF data show that the proposed method reduces the RDF data size by about half and is up to 5 times faster than previous methods.

XML-based Modeling for Semantic Retrieval of Syslog Data (Syslog 데이터의 의미론적 검색을 위한 XML 기반의 모델링)

  • Lee Seok-Joon;Shin Dong-Cheon;Park Sei-Kwon
    • The KIPS Transactions:PartD / v.13D no.2 s.105 / pp.147-156 / 2006
  • Event logging plays an increasingly important role in system and network management, and syslog is a de facto standard for logging system events. However, due to the semi-structured nature of Common Log Format data, most studies on log analysis focus on frequent patterns. The Extensible Markup Language (XML) can provide a good representation scheme for structuring and searching the formatted data found in syslog messages. However, previous XML-formatted schemes and applications for system logging are not suitable for semantic approaches such as ranking-based search or similarity measurement for log data. In this paper, based on ranked keyword search techniques over XML documents, we propose an XML tree structure for syslog data through a new data modeling approach. Finally, we show the suitability of the proposed structure for semantic retrieval.

Efficient Blog Retrieval System by Topic-based Weighting (주제어 가중치 기법에 의한 효율적인 블로그 검색 시스템)

  • Shin, Hyeon-Il;Yun, Un-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.15 no.4 / pp.1-9 / 2010
  • In the new generation of the Web, commonly called "Web 2.0", blogging has made it easier for users to publish information or their opinions on the web. Various blog retrieval algorithms have been proposed to search for blogs more effectively. However, keyword-based search and link-analysis-based blog ranking systems cannot satisfy users' requirements. In this paper, we suggest a topic-based weighting blog retrieval system in which the links between blog posts and search words are considered to improve the search results. Our system extracts topics from each blog and weights them much higher than other guide words. In a comparison with other systems, the proposed topic-based system shows a better recall rate for search results.
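
The idea of weighting extracted topics above ordinary words, as described in the abstract above, can be sketched as follows. The abstract does not give the actual weighting scheme, so the function name `score_post`, the weight values, and the simple additive scoring are hypothetical choices made for illustration.

```python
def score_post(query_terms, topics, body_terms, topic_weight=3.0, body_weight=1.0):
    """Score a blog post: a query term matching an extracted topic counts more
    than one that merely appears among the other words of the post."""
    score = 0.0
    for term in query_terms:
        if term in topics:
            score += topic_weight
        elif term in body_terms:
            score += body_weight
    return score


query = ["python"]
# Post A is about Python; post B only mentions it in passing.
score_a = score_post(query, topics={"python"}, body_terms={"python", "tutorial"})
score_b = score_post(query, topics={"travel"}, body_terms={"python", "zoo", "trip"})
```

With these assumed weights, the post whose topic matches the query ranks above the one that merely mentions the search word.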

Query Processing Model Using Two-level Fuzzy Knowledge Base (2단계 퍼지 지식베이스를 이용한 질의 처리 모델)

  • Lee, Ki-Young;Kim, Young-Un
    • Journal of the Korea Society of Computer and Information / v.10 no.4 s.36 / pp.1-16 / 2005
  • When Web-based specialized retrieval systems for scientific fields severely restrict how users can express their information requests, the process of analyzing information content and the process of acquiring information become inconsistent. Accordingly, this study suggests a re-ranking retrieval model that reflects the content-based similarity between the user's query terms and index words by capturing the knowledge structure of documents. To accomplish this, the former constructs a thesaurus and a similarity relation matrix to provide a subject analysis mechanism, and the latter proposes an algorithm that establishes a search model, such as query expansion, to analyze the user's demands. The algorithm suggested in this study, which performs retrieval using the information structure of a retrieval system, can therefore serve as a content-based retrieval mechanism that establishes a two-step search model for the preservation of recall and the improvement of accuracy, which was a weak point of the previous fuzzy retrieval model.
