• Title/Abstract/Keywords: Indexing Model

An Indexing Model for Efficient Structure Retrieval of XML Documents (XML 문서의 효율적인 구조 검색을 위한 색인 모델)

  • Park, Jong-Gwan;Son, Chung-Beom;Gang, Hyeong-Il;Yu, Jae-Su;Lee, Byeong-Yeop
    • The KIPS Transactions: Part D / v.8D no.5 / pp.451-460 / 2001
  • In this paper, we propose an indexing model for efficient structure retrieval of XML documents. The proposed model consists of structured information that supports a wide range of queries, such as content-based queries and structure-attribute queries, at all levels of the document hierarchy, together with index organizations constructed from that information. To support structured retrieval, we present a new representation method for structured information. Using this structured information, we design a content index, a structure index, and an attribute index for efficient retrieval. We also explain the processing procedures for mixed queries and evaluate the performance of the proposed indexing model. The results show that the proposed model achieves better retrieval performance than the existing method.
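
As a rough illustration of how a content index, a structure index, and an attribute index can cooperate on a mixed query, here is a minimal Python sketch. It is not the representation method proposed in the paper: the function names `build_indexes` and `mixed_query`, the input shape (a dict of document id to XML string), and the use of plain element paths as structure information are all assumptions made for this example. Elements that share the same path are not distinguished, which a real structure representation would need to handle.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indexes(docs):
    """Build three toy indexes from docs, a dict of doc id -> XML string (assumed shape)."""
    content = defaultdict(set)    # term -> {(doc_id, element path)}
    structure = defaultdict(set)  # element path -> {doc_id}
    attribute = defaultdict(set)  # (attr name, attr value) -> {(doc_id, element path)}

    def walk(doc_id, elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        structure[path].add(doc_id)
        for name, value in elem.attrib.items():
            attribute[(name, value)].add((doc_id, path))
        # Only the text directly inside the element is indexed (tail text is ignored).
        for term in (elem.text or "").lower().split():
            content[term].add((doc_id, path))
        for child in elem:
            walk(doc_id, child, path)

    for doc_id, xml_text in docs.items():
        walk(doc_id, ET.fromstring(xml_text), "")
    return content, structure, attribute


def mixed_query(indexes, term, path_suffix, attr=None):
    """Return ids of documents where `term` occurs in an element whose path ends
    with `path_suffix`, optionally requiring the (name, value) pair `attr`."""
    content, structure, attribute = indexes
    paths = {p for p in structure if p.endswith(path_suffix)}
    hits = {(d, p) for d, p in content.get(term, set()) if p in paths}
    if attr is not None:
        hits &= attribute.get(attr, set())
    return sorted({d for d, _ in hits})
```

A call such as `mixed_query(indexes, "xml", "/title")` restricts a content match to title elements, and passing `attr=("lang", "en")` narrows it further to elements carrying that attribute.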

A Study on Semantic Based Indexing and Fuzzy Relevance Model (의미기반 인덱스 추출과 퍼지검색 모델에 관한 연구)

  • Kang, Bo-Yeong;Kim, Dae-Won;Gu, Sang-Ok;Lee, Sang-Jo
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.238-240 / 2002
  • An Information Retrieval (IR) system that comprehends the semantic content of documents and knows the preferences of its users can search for information on the Internet more effectively and improve IR performance. We therefore propose an IR model that combines semantic-based indexing with a fuzzy relevance model. In addition to the statistical approach, we adopt a semantic approach to indexing, lexical chains, because we assume it will improve the performance of index term extraction. Furthermore, we combine the semantic-based indexing with the fuzzy model, which determines the relevance between user preferences and index terms. The proposed system works as follows. First, the system indexes documents with an efficient index term extraction method based on lexical chains. Then, when a user retrieves information from the indexed document collection, the extended IR model calculates and ranks the relevance among the user query, the user preferences, and the index terms using several metrics. When we tested each module separately, semantic-based indexing and the extended fuzzy model, each gave noticeable results. The combination of these modules is expected to improve information retrieval performance.

PDFindexer: Distributed PDF Indexing system using MapReduce

  • Murtazaev, Aziz;Kihm, Jang-Su;Oh, Sangyoon
    • International Journal of Internet, Broadcasting and Communication / v.4 no.1 / pp.13-17 / 2012
  • Indexing converts a raw document collection into an easily searchable representation. Web search engines such as Google or Yahoo provide subsecond response times, which are made possible by efficient indexing of web pages across the entire Web. The indexing process becomes challenging as the scale grows. Parallel techniques such as the MapReduce framework can assist in efficient large-scale indexing. In this paper we propose PDFindexer, a system for indexing scientific papers in PDF using the MapReduce programming model. Unlike Web search engines, our target domain is scientific papers, which have a pre-defined structure, such as title, abstract, sections, and references. Our proposed system parses scientific papers in PDF, recreates their structure, and performs efficient distributed indexing with the MapReduce framework on a cluster of nodes. We provide an overview of the system, its components, and the interactions among them. We discuss some issues related to the design of the system and the use of MapReduce in parsing and indexing a large document collection.
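
The MapReduce style of indexing described in this abstract can be sketched in a few lines of single-process Python. This is only an illustration of the programming model, not the PDFindexer implementation: the names `map_phase`, `shuffle`, `reduce_phase`, and `build_index`, as well as the input shape (paper id mapped to a dict of section name to text), are assumptions, and PDF parsing is taken as already done.

```python
from collections import defaultdict
from itertools import chain


def map_phase(doc_id, sections):
    """Map step: emit ((term, section), doc_id) pairs for one parsed paper."""
    for section, text in sections.items():
        for term in set(text.lower().split()):
            yield (term, section), doc_id


def shuffle(pairs):
    """Group mapped values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, doc_ids):
    """Reduce step: collapse the values for one key into a sorted posting list."""
    return key, sorted(set(doc_ids))


def build_index(papers):
    """Run map, shuffle, and reduce in sequence over all papers."""
    mapped = chain.from_iterable(map_phase(d, s) for d, s in papers.items())
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Keying on `(term, section)` keeps the paper structure in the index, so a lookup can be limited to titles or abstracts. On a real cluster the map and reduce functions would run on different nodes, with the framework handling the grouping step.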

Development of an Indexing Model for Korean Textual Databases (국내 문자정보 데이터베이스의 색인에 관한 연구)

  • 정영미
    • Journal of the Korean Society for Information Management / v.13 no.1 / pp.19-43 / 1996
  • The indexing languages and techniques used for Korean textual databases were surveyed, and the retrieval effectiveness of two indexing languages was evaluated in an online searching experiment. It was found that most of the Korean textual databases surveyed employ natural language indexing by either an automatic or a manual method, and that natural language indexing may outperform controlled language indexing if appropriate search strategies are employed.

Index Ontology Repository for Video Contents (비디오 콘텐츠를 위한 색인 온톨로지 저장소)

  • Hwang, Woo-Yeon;Yang, Jung-Jin
    • Journal of Korea Multimedia Society / v.12 no.10 / pp.1499-1507 / 2009
  • With the abundance of digital content, precise indexing technology is in constant demand. To meet this demand, intelligent software entities need to take part in information retrieval, and interoperability among intelligent entities, including humans, must be supported. In this paper, we analyze the unifying framework for multi-modality indexing proposed by Snoek and Worring. Our work investigates a method for improving the reliability of indexing information in content-based automated indexing techniques. It supports the creation and control of abstracted high-level indexing information through the ontological concepts of Semantic Web technologies. Moreover, it attempts to present a fundamental model that allows interoperability between human and machine and between machine and machine. A memory-resident model of ontology processing is inappropriate for handling an enormous amount of indexing information. An ontology repository and an inference engine are required for consistent retrieval and reasoning over logically expressed knowledge. Our work presents an experiment in storing and retrieving the designed knowledge using the Minerva ontology repository, demonstrating that the approach meets these technical and efficiency requirements. Finally, the possibility of efficient indexing is considered in light of related research.

Automated Essay Grading: An Application For Historical Malay Text

  • Syed Mustapha, S.M.F.D;Idris, N.
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.237-245 / 2001
  • Automated essay grading has been proposed for over thirty years. Only recently have practical implementations been constructed and tested. This paper investigates the role of the nearest-neighbour algorithm within information retrieval as a way of grading essays automatically, in a system called the Automated Essay Grading System. It is intended to offer teachers individualized assistance in grading students' essays. The system involves several processes: indexing, structuring of the model answer, and grade processing. The indexing process comprises document indexing and query processing, which are mainly used for representing the documents and the query. Structuring the model answer amounts to preparing the marking scheme, and grade processing is the process of assessing the essay. To test the effectiveness of the developed algorithms, they were tested against History text in Malay. The results show that information retrieval and the nearest-neighbour algorithm are a practical combination that offers acceptable performance for grading essays.
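
The idea of grading by nearest neighbour in an information retrieval setting can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the system described in the paper: the functions `vectorize`, `cosine`, and `grade` are hypothetical, the essay is treated as a query over already graded reference answers, and raw term counts with cosine similarity stand in for whatever document representation and marking scheme the authors used.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def grade(essay, graded_answers):
    """Return the grade of the reference answer most similar to the essay.

    graded_answers is a list of (text, grade) pairs (assumed input shape).
    """
    query = vectorize(essay)
    _text, best_grade = max(graded_answers, key=lambda item: cosine(query, vectorize(item[0])))
    return best_grade
```

The essay plays the role of the query and the graded answers play the role of the indexed documents, which mirrors the document indexing and query processing split mentioned in the abstract.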

Three-dimensional object recognition using efficient indexing: Part I - Bayesian indexing (효율적인 인덱싱 기법을 이용한 3차원 물체 인식:Part I-Bayesian 인덱싱)

  • 이준호
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.10 / pp.67-75 / 1997
  • A design for a system that performs rapid recognition of three-dimensional objects is presented, focusing on efficient indexing. In order to retrieve the best matched models without exploring all possible object matches, we have employed a Bayesian framework. A decision-theoretic measure of the discriminatory power of a feature for a model object is defined in terms of posterior probability. The detectability of a feature, defined as a function of the feature itself, the viewpoint, the sensor characteristics, and the feature detection algorithm(s), is also considered in the computation of discriminatory power. In order to speed up the indexing or selection of correct objects, we generate and verify the object hypotheses for features detected in a scene in the order of the discriminatory power of these features for model objects.
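
A small Python sketch may help make the posterior-based ordering concrete. The abstract does not give the exact formula, so the form used here is an assumption: discriminatory power is taken to be the Bayes posterior P(model | feature) multiplied by a detectability weight in [0, 1]. The function names `discriminatory_power` and `hypothesis_order` and the dictionary layouts are hypothetical.

```python
def discriminatory_power(likelihood, prior, detectability):
    """Compute power[f][m] for every feature f and model m.

    likelihood[m][f] = P(f | m), prior[m] = P(m), detectability[f] in [0, 1].
    Power is the posterior P(m | f) scaled by detectability (assumed form).
    """
    features = {f for table in likelihood.values() for f in table}
    power = {}
    for f in features:
        joint = {m: likelihood[m].get(f, 0.0) * prior[m] for m in likelihood}
        evidence = sum(joint.values())
        power[f] = {
            m: detectability.get(f, 1.0) * (p / evidence if evidence else 0.0)
            for m, p in joint.items()
        }
    return power


def hypothesis_order(scene_features, power):
    """Return (feature, model) hypotheses sorted by decreasing discriminatory power."""
    hyps = [
        (power[f][m], f, m)
        for f in scene_features if f in power
        for m in power[f] if power[f][m] > 0
    ]
    # Ties are broken alphabetically so the ordering is deterministic.
    return [(f, m) for _, f, m in sorted(hyps, key=lambda h: (-h[0], h[1], h[2]))]
```

Verifying hypotheses in this order means that a feature seen almost exclusively on one model is tried first, so a correct match tends to be confirmed before less informative features are examined.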

An Efficient Dynamic Indexing Model for Various Structure Retrievals of XML Documents (XML 문서의 다양한 구조 검색을 위한 효율적인 동적 색인 모델)

  • 신승호;손충범;강형일;유재수
    • Journal of KIISE: Databases / v.31 no.1 / pp.48-60 / 2004
  • XML documents consist of elements that are the basic units of information. When the structure of XML documents changes dynamically, the structure information must be updated efficiently without changing the information already held in the index structure, so that retrieval remains fast. In this paper, we propose a dynamic indexing model that updates the index structure in real time as the structure of XML documents is changed by the insertion and deletion of elements. Our dynamic indexing model consists of a structure information representation method and a dynamic index structure. The structure information representation method supports various types of structure retrieval. Our dynamic index structure processes various structural queries efficiently. We show through various experiments that our method outperforms existing ones in processing various types of queries, such as content-based queries, structural queries, and hybrid queries.

Keywords and Spatial Based Indexing for Searching the Things on Web

  • Faheem, Muhammad R.;Anees, Tayyaba;Hussain, Muzammil
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1489-1515 / 2022
  • The number of interconnected real-world devices such as sensors, actuators, and physical devices has increased with the advancement of technology. Due to this advancement, users face difficulties searching for the location of these devices, and the central issue is the findability of Things. In the WoT environment, keyword-based and geospatial searching approaches are used to locate these devices anywhere and on the web interface. A few static methods of indexing and ranking are discussed in the literature, but they are not suitable for finding devices dynamically. The authors have proposed a mechanism for dynamic and efficient searching of the devices in this paper. Indexing and ranking approaches can improve dynamic searching in different ways. The present paper has focused on indexing for improving dynamic searching and has indexed the Things Description in Solr. This paper presents the Things Description according to the model of W3C JSON-LD along with the open-access APIs. Search efficiency can be analyzed with query response timings, and the accuracy of response timings is critical for search results. Therefore, in this paper, the authors have evaluated their approach by analyzing the search query response timings and the accuracy of their search results. This study utilized different indexing approaches: keyword-based, spatial, and hybrid. Results indicate that response time and accuracy are better with the hybrid approach than with keyword-based and spatial indexing approaches.
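
To show what a hybrid of keyword-based and spatial search might look like, here is a minimal in-memory Python sketch. It does not use Solr and is not the authors' mechanism: the functions `haversine_km`, `build_keyword_index`, and `hybrid_search`, and the record layout (a `description` string and a `location` given as latitude and longitude), are assumptions for illustration only.

```python
import math
from collections import defaultdict


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def build_keyword_index(things):
    """Map each lowercase term in a description to the ids of the Things using it."""
    index = defaultdict(set)
    for thing_id, record in things.items():
        for term in record["description"].lower().split():
            index[term].add(thing_id)
    return index


def hybrid_search(things, index, keyword, center, radius_km):
    """Keyword lookup first, then a spatial filter; results are sorted nearest first."""
    candidates = index.get(keyword.lower(), set())
    hits = [(haversine_km(center, things[t]["location"]), t) for t in candidates]
    return [t for dist, t in sorted(hits) if dist <= radius_km]
```

The keyword index shrinks the candidate set before any distance is computed, which is one plausible reason a hybrid approach can respond faster than a purely spatial scan.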

Three-dimensional object recognition using efficient indexing: Part II - generation and verification of object hypotheses (효율적인 인덱싱 기법을 이용한 3차원 물체인식:Part II-물체에 대한 가설의 생성과 검증)

  • 이준호
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.10 / pp.76-88 / 1997
  • Based on the principles described in Part I, we have implemented a working prototype vision system using a feature structure called an LSG (local surface group) for generating object hypotheses. In order to verify an object hypothesis, we estimate the view of the hypothesized model object and render the model object for the computed view. The object hypothesis is then verified by finding additional features in the scene that match those present in the rendered image. Experimental results on synthetic and real range images show the effectiveness of the indexing scheme.
