• Title/Summary/Keyword: Graph Retrieval

The Scheme for Path-based Query Processing on the Semantic Data (시맨틱 웹 데이터의 경로 기반 질의 처리 기법)

  • Kim, Youn-Hee;Kim, Jee-Hyun
    • Journal of the Korea Society of Computer and Information / v.14 no.10 / pp.31-41 / 2009
  • In the Semantic Web, intelligent information retrieval and automated web services become possible by defining the concepts of information resources and representing the semantic relations between resources with metadata and ontologies. Efficient management of semantic data such as ontologies and metadata is therefore very important for implementing the essential functions of the Semantic Web. We propose an index structure that supports more accurate search results and efficient query processing by considering the semantic and structural features of semantic data. In particular, we use a graph data model to express these features and process various types of queries using graph-model-based path expressions. The proposed index differs from earlier studies in that it covers the concept of the Semantic Web in its entirety: queries run over structural path information that is extracted first and over additional path information that is extracted second through semantic inference with the ontology. In the experiments, we show that our approach is more accurate and efficient than previous approaches and is applicable to various applications in the Semantic Web.
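The abstract above does not give the index layout, so the following is only a minimal sketch of the general idea of a path-based index over a graph data model. It assumes the data is a list of (subject, predicate, object) triples and that a path is identified by its sequence of edge labels. The function name `build_path_index`, the `max_len` bound, and the sample triples are hypothetical, and the sketch covers only structurally extracted paths, not paths derived through ontology inference.

```python
from collections import defaultdict


def build_path_index(triples, max_len=2):
    """Map each label path (tuple of predicates) to the (start, end) pairs it connects."""
    # Adjacency list: node -> list of (predicate, target node).
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[s].append((p, o))

    index = defaultdict(set)

    def walk(start, node, path):
        if path:
            index[path].add((start, node))
        if len(path) == max_len:  # bound path length so cycles terminate
            return
        for p, o in adj.get(node, []):
            walk(start, o, path + (p,))

    for s in list(adj):
        walk(s, s, ())
    return index


# Hypothetical sample data, for illustration only.
triples = [
    ("paper1", "author", "kim"),
    ("kim", "affiliation", "univA"),
    ("paper2", "author", "lee"),
]
index = build_path_index(triples)

# A path query becomes a single dictionary lookup.
print(index[("author", "affiliation")])
```

Precomputing the paths trades index size for lookup speed, which is why the sketch bounds the path length with `max_len`.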

Metadata-Based Data Structure Analysis to Optimize Search Speed and Memory Efficiency (검색 속도와 메모리 효율 최적화를 위한 메타데이터 기반 데이터 구조 분석)

  • Kim Se Yeon;Lim Young Hoon
    • The Transactions of the Korea Information Processing Society / v.13 no.7 / pp.311-318 / 2024
  • As the amount of data increases with the development of artificial intelligence and the Internet, data management is becoming increasingly important, and efficient data retrieval and use of memory space are crucial. In this study, we investigate how to optimize search speed and memory efficiency by analyzing data structures based on metadata. We compared and analyzed the performance of the array, association list, dictionary, binary tree, and graph data structures using the metadata of photographic images, focusing on time and space complexity. The experiments confirmed that, for large-scale image data, the dictionary data structure performs best in collection speed and the graph data structure performs best in search speed. We expect the results of this paper to provide practical guidelines for selecting data structures that optimize search speed and memory efficiency for image data.

A Parameter-Free Approach for Clustering and Outlier Detection in Image Databases (이미지 데이터베이스에서 매개변수를 필요로 하지 않는 클러스터링 및 아웃라이어 검출 방법)

  • Oh, Hyun-Kyo;Yoon, Seok-Ho;Kim, Sang-Wook
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.80-91 / 2010
  • As the volume of image data increases dramatically, good organization of image data is crucial for efficient image retrieval. Clustering is a typical way of organizing image data. However, traditional clustering methods have the drawback of requiring the user to provide the number of clusters as a parameter before clustering. In this paper, we discuss an approach for clustering image data that does not require this parameter. The proposed approach is based on Cross-Association, which finds structures or patterns hidden in data using the relationships between individual objects. To apply Cross-Association to the clustering of image data, we first convert the image data into a graph. We then perform Cross-Association on the resulting graph and interpret the results from a clustering perspective. We also propose a hierarchical clustering method and an outlier detection method based on Cross-Association. Through a series of experiments, we verify the effectiveness of the proposed approach. Finally, we discuss how to find a good value of k for the k-nearest neighbor search and compare the clustering results obtained with symmetric and asymmetric ways of building the graph.
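The difference between the asymmetric and symmetric ways of building a k-nearest neighbor graph, mentioned at the end of the abstract above, can be illustrated with a small sketch. It assumes each image is already reduced to a numeric feature vector and that Euclidean distance is used. The abstract specifies neither, and the function name `knn_graph` and the sample points are hypothetical.

```python
import math


def knn_graph(points, k, symmetric=False):
    """Return a set of directed edges (i, j) linking each point to its k nearest neighbors."""
    n = len(points)
    edges = set()
    for i in range(n):
        # Sort all other points by Euclidean distance from point i.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            edges.add((i, j))
    if symmetric:
        # Symmetric variant: add the reverse of every edge.
        edges |= {(j, i) for i, j in edges}
    return edges


# Hypothetical one-dimensional layout: two close points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

asym = knn_graph(points, k=1)
sym = knn_graph(points, k=1, symmetric=True)
print(sorted(asym))  # [(0, 1), (1, 0), (2, 1)]
print(sorted(sym))   # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

In the asymmetric graph the distant point 2 links to point 1 but receives no link back. The symmetric graph adds the reverse edge, which changes the adjacency structure that a method such as Cross-Association would then operate on.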

Integration of Component Image Information and Design Information by Graph to Support Product Design Information Reuse (제품 설계 정보 재사용을 위한 그래프 기반의 부품 영상 정보와 설계 정보의 병합)

  • Lee, Hyung-Jae;Yang, Hyung-Jeong;Kim, Kyoung-Yun;Kim, Soo-Hyung;Kim, Sun-Hee
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1017-1026 / 2006
  • Recently, the distributed collaborative development environment has been recognized as an alternative environment for product development in which multidisciplinary participants are naturally involved. Reuse of product design information has long been recognized as one of the core requirements for efficient product development. This paper presents an image-based retrieval system that supports the reuse of product design information. In the system, product images obtained from multi-modal devices are used to reuse design information. The proposed system segments a product image using a labeling method and generates an attributed relational graph (ARG) that represents the properties of the segmented regions and their relationships. The generated ARG is then extended by integrating the corresponding part and assembly information. In this manner, assembly design information can be reused starting from a product image. The main advantages of the presented system are as follows. First, the system does not depend on specific design tools, because it uses multimedia images that can be obtained easily from peripheral devices. Second, ratio-based features extracted from the images enable retrieval of images that contain parts of various sizes. Third, the system shows outstanding search performance, because it uses a variety of information about the segmented part regions and the relationships between parts.

Domain Question Answering System (도메인 질의응답 시스템)

  • Yoon, Seunghyun;Rhim, Eunhee;Kim, Deokho
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.144-147 / 2015
  • Question Answering (QA) services can provide exact answers to user questions written in natural language. This research focuses on how to build a QA system for a specific domain. We present the online and offline architecture of a QA system for a targeted domain, including domain detection, question analysis, reasoning, information retrieval, filtering, answer extraction, re-ranking, and answer generation, as well as data preparation. Test results on an official Frequently Asked Question (FAQ) set showed a top-1 accuracy of 68% and a top-5 accuracy of 77%. The contributions of each part, such as the question analysis system, document search engine, knowledge graph engine, and re-ranking module, to the final answer are also presented.
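The top-1 and top-5 accuracy figures reported above follow the usual definition: a question counts as correct at rank k if the gold answer appears among the first k ranked candidates. A minimal sketch of that metric is shown here. The function name `top_k_accuracy` and the toy data are hypothetical and are not taken from the paper's FAQ test set.

```python
def top_k_accuracy(ranked_answers, gold, k):
    """Fraction of questions whose gold answer is among the first k ranked candidates."""
    hits = sum(1 for ranked, g in zip(ranked_answers, gold) if g in ranked[:k])
    return hits / len(gold)


# Toy data, for illustration only: one ranked candidate list per question.
ranked_answers = [
    ["a", "b"],
    ["c", "d"],
    ["e", "f"],
    ["x", "g"],
]
gold = ["a", "d", "z", "g"]

print(top_k_accuracy(ranked_answers, gold, k=1))  # 0.25
print(top_k_accuracy(ranked_answers, gold, k=2))  # 0.75
```

Accuracy can only stay the same or grow as k increases, which is consistent with the top-5 figure being higher than the top-1 figure.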

Join Query Performance Optimization Based on Convergence Indexing Method (융합 인덱싱 방법에 의한 조인 쿼리 성능 최적화)

  • Zhao, Tianyi;Lee, Yong-Ju
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.16 no.1 / pp.109-116 / 2021
  • Since RDF (Resource Description Framework) triples are modeled as a graph, existing solutions from relational databases and XML technology cannot be adopted directly. To store, index, and query Linked Data more efficiently, we propose a convergence indexing method that combines R*-trees and K-dimensional trees. The method uses a hybrid storage system based on HDD (Hard Disk Drive) and SSD (Solid State Drive) devices, and a separated filter and refinement index structure that filters out unnecessary data and then refines the intermediate result. We compare performance using three standard join retrieval algorithms. The experimental results demonstrate that our method achieves remarkable performance compared to existing methods such as Quad and Darq.

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. Retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output. Current search tools also cannot retrieve documents related to a retrieved document from a gigantic collection of documents. The most important problem for many current search systems is improving the quality of search, that is, providing related documents and reducing the number of unrelated documents in the search results as far as possible. To address this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. In detail, references contained in academic articles give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the article with the cited works.
Citation indexes were originally designed mainly for information retrieval. The citation links allow the literature to be navigated in unique ways. Papers can be located independently of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). However, CiteSeer cannot index links between articles that researchers do not make themselves, because it only indexes the links that researchers create when they cite other articles. For the same reason, CiteSeer is difficult to scale. These problems motivate us to design a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. Each document is converted into a tabular form in which each extracted predicate is checked against its possible subjects and objects. We build a hierarchical graph of each document from this table and then integrate the graphs of the documents. From the graph of the entire collection, we calculate the area of each document relative to the integrated documents, and we mark the relations among documents by comparing these areas. We also propose a method for the structural integration of documents that retrieves documents from the graph, making it easier for users to find information. We compared the performance of the proposed approach with the Lucene search engine using the ranking formulas. The resulting F-measure is about 60%, which is about 15% better.
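The backward and forward navigation that a citation index offers, as described in the abstract above, amounts to keeping two mappings over the same citation links. The following is a minimal sketch of that idea only. It is not CiteSeer's ACI implementation, and the class name `CitationIndex`, its method names, and the sample article identifiers are hypothetical.

```python
from collections import defaultdict


class CitationIndex:
    """Toy citation index that supports backward and forward navigation."""

    def __init__(self):
        self.cites = defaultdict(set)     # article -> articles it cites
        self.cited_by = defaultdict(set)  # article -> articles that cite it

    def add_citation(self, citing, cited):
        # Record the link in both directions so either lookup is direct.
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)

    def backward(self, article):
        """Navigate backward in time: the works this article cites."""
        return set(self.cites.get(article, ()))

    def forward(self, article):
        """Navigate forward in time: later articles that cite this one."""
        return set(self.cited_by.get(article, ()))


# Hypothetical article identifiers, for illustration only.
ci = CitationIndex()
ci.add_citation("B2005", "A1998")
ci.add_citation("C2010", "A1998")
ci.add_citation("C2010", "B2005")

print(sorted(ci.backward("C2010")))  # ['A1998', 'B2005']
print(sorted(ci.forward("A1998")))   # ['B2005', 'C2010']
```

The sketch also makes the limitation named in the abstract visible: two articles on the same subject that never cite each other share no link in either mapping.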

Automated Development of Rank-Based Concept Hierarchical Structures using Wikipedia Links (위키피디아 링크를 이용한 랭크 기반 개념 계층구조의 자동 구축)

  • Lee, Ga-hee;Kim, Han-joon
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.61-76 / 2015
  • The hierarchical concept tree is widely used as a crucial data structure for indexing huge amounts of textual data. This paper proposes a generality rank-based method that automatically builds hierarchical concept structures from Wikipedia data. The method regards each Wikipedia article as a concept and generates hierarchical relationships among concepts. To estimate the generality of concepts, we devised a ranking function that mainly uses the number of hyperlinks among Wikipedia articles. The ranking function is used to compute the probabilistic subsumption among concepts, which allows relatively more stable hierarchical structures to be generated. Finally, the set of concept pairs with hierarchical relationships is visualized as a DAG (directed acyclic graph). Through an empirical analysis using the concept hierarchy of the Open Directory Project, we show that the proposed method outperforms a representative baseline method and can automatically extract concept hierarchies with high accuracy.

Topic-Network based Topic Shift Detection on Twitter (트위터 데이터를 이용한 네트워크 기반 토픽 변화 추적 연구)

  • Jin, Seol A;Heo, Go Eun;Jeong, Yoo Kyung;Song, Min
    • Journal of the Korean Society for Information Management / v.30 no.1 / pp.285-302 / 2013
  • This study identified topic shifts and patterns over time by analyzing an enormous amount of Twitter data, which is characterized by high accessibility and brevity. First, we extracted keywords for a certain product and used them to build a topic network through co-word analysis, whose nodes and edges allow an intuitive understanding of the keywords associated with each topic. We conducted a temporal analysis of term co-occurrence as well as topic modeling to examine the results of the network analysis. In addition, comparing topic shifts on Twitter with the corresponding retrieval results from newspapers confirms that Twitter responds immediately to news media and spreads negative issues quickly. Our findings suggest that companies can use the proposed technique to identify the public's negative opinions as quickly as possible and apply them to timely decision making and effective responses to their customers.

Hypertext Model Extension and Dynamic Server Allocation for Database Gateway in Web Database Systems (웹 데이타베이스에서 하이퍼텍스트 모델 확장 및 데이타베이스 게이트웨이의 동적 서버 할당)

  • Shin, Pan-Seop;Kim, Sung-Wan;Lim, Hae-Chull
    • Journal of KIISE:Databases / v.27 no.2 / pp.227-237 / 2000
  • A Web database system is a large-scale multimedia application system that has multimedia processing facilities and cooperates with relational and object-oriented DBMSs. Conventional hypertext modeling methods and DB gateways have limitations for Web databases because of their limited presentation versatility and inefficient concurrency control caused by bottlenecks in cooperative processing. We therefore suggest a Dynamic Navigation Model and a Virtual Graph Structure. The Dynamic Navigation Model supports implicit query processing and the dynamic creation of navigation spaces, and introduces node-link creation rules that consider navigation styles. We propose a mapping methodology between the suggested hypertext model and the relational data model, and suggest a dynamic allocation scheduling technique for query processing servers based on weighted values. We show that the proposed technique enhances the retrieval performance of Web database systems when processing complex queries concurrently.
