• Title/Summary/Keyword: Information Retrieval Engine

Search Result 137, Processing Time 0.029 seconds

Design and Implementation of Information Retrieval System Based on Ontology Using Semantic Web (시맨틱 웹을 이용한 온톨로지 기반의 정보검색 시스템 설계 및 구현)

  • Seo, Woo-Jin;Rhyu, Kyeong-Taek
    • Journal of Digital Convergence
    • /
    • v.17 no.1
    • /
    • pp.209-217
    • /
    • 2019
  • This paper aims to lay the foundation for a search system by building an online search engine suited to the search domain, one that enables the search, conversion, integration, and sharing of information. The system uses an ontology to infer hierarchical relationships, deduce objects based on that hierarchy, and extract attributes, so that it searches the areas relevant to the data the user wants. To test this approach, the information search system was implemented for keywords related to 'qualifications'. The implemented system arranges the meaning and relationships of each attribute online so that the general public can search for information quickly, easily, and accurately. The implementation results were compared with those of Naver and Daum, two major search engines. The search engine of this study, built on an ontology suited to the search domain and searching via the semantic web, was evaluated as producing excellent results. However, a more formalized online location is thought to be necessary to increase the accuracy and reliability of the search engine and to cover more comprehensive categories of search terms.
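
The hierarchical inference described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the instance names, and the single-parent `subclass_of` mapping are all hypothetical, chosen only to show how a query on a general class can reach instances of its transitive subclasses.

```python
def descendants(subclass_of, root):
    """Return root plus every class that is transitively a subclass of root."""
    # Invert the child -> parent mapping into parent -> [children].
    children = {}
    for child, parent in subclass_of.items():
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [root]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        stack.extend(children.get(cls, []))
    return seen


def search(subclass_of, instances, query_class):
    """Return the names of instances whose class falls under query_class."""
    classes = descendants(subclass_of, query_class)
    return sorted(name for name, cls in instances.items() if cls in classes)


# Hypothetical toy ontology for the 'qualifications' domain.
subclass_of = {
    "NationalQualification": "Qualification",
    "PrivateQualification": "Qualification",
    "EngineerLicense": "NationalQualification",
}
instances = {
    "Information Processing Engineer": "EngineerLicense",
    "Barista Certificate": "PrivateQualification",
    "Driving Course": "Course",
}
```

With this toy data, `search(subclass_of, instances, "Qualification")` returns both certificates but not "Driving Course", because "Course" is not under "Qualification" in the assumed hierarchy.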

Effective Picture Search in Lifelog Management Systems using Bluetooth Devices (라이프로그 관리 시스템에서 블루투스 장치를 이용한 효과적인 사진 검색 방법)

  • Chung, Eun-Ho;Lee, Ki-Yong;Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.4
    • /
    • pp.383-391
    • /
    • 2010
  • A lifelog management system provides users with services to store, manage, and search their lifelogs. This paper proposes a fully automatic method for collecting real-world social contacts and a lifelog search engine that uses the collected social contact information as keywords. Short-range wireless network devices in mobile phones are used to detect their users' social contacts. A human-Bluetooth relationship matrix is built from the frequency with which a person and a Bluetooth device are observed at the same time. Results show that when only 20% of the full social contact information over the observation times is used for the calculation, 90% of human-Bluetooth relationships can be correctly acquired. The proposed lifelog search engine takes a person's name as the keyword and, using the vector information retrieval model, compares two vectors: a row of the human-Bluetooth matrix and the vector of Bluetooth devices scanned while a lifelog was created. This search engine returns more lifelogs than an existing text-matching search engine and, unlike existing search engines, ranks the results.
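
The vector comparison in this abstract can be sketched as follows. The abstract names only a "vector information retrieval model", so the use of cosine similarity here is an assumption, and the function names, device ordering, and sample weights are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_lifelogs(person_row, lifelogs):
    """Rank lifelogs by similarity to one person's row of the matrix.

    person_row: co-occurrence weight of the person with each device.
    lifelogs: maps a lifelog name to a 0/1 vector of devices scanned
              when that lifelog was created (same device order).
    Lifelogs with zero similarity are dropped from the result.
    """
    scored = [(cosine(person_row, vec), name) for name, vec in lifelogs.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical data: three Bluetooth devices, one person, three photos.
alice_row = [0.9, 0.1, 0.0]
lifelogs = {
    "photo_a": [1, 0, 0],
    "photo_b": [0, 1, 1],
    "photo_c": [0, 0, 1],
}
```

Here `rank_lifelogs(alice_row, lifelogs)` ranks "photo_a" ahead of "photo_b" and omits "photo_c", which shows how a name query can both retrieve and order lifelogs that contain no matching text.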

Examining Categorical Transition and Query Reformulation Patterns in Image Search Process (이미지 검색 과정에 나타난 질의 전환 및 재구성 패턴에 관한 연구)

  • Chung, Eun-Kyung;Yoon, Jung-Won
    • Journal of the Korean Society for Information Management
    • /
    • v.27 no.2
    • /
    • pp.37-60
    • /
    • 2010
  • The purpose of this study is to investigate image search query reformulation patterns in relation to image attribute categories. A total of 592 sessions and 2,445 queries from Excite Web search engine log data were analyzed using Batley's visual information types and two facets and seven sub-facets of query reformulation patterns. The results are organized into two parts: query reformulation and categorical transition. The most dominant query categories are specific and general/nameable, and this tendency persists across search stages. In terms of reformulation patterns, Parallel movement is the most dominant, although there are slight differences depending on the initial or preceding query category. In examining categorical transitions, 60-80% of search queries were found to be reformulated within the same image attribute category. These findings may be applied to the practice and implementation of image retrieval systems, in particular to assisting users' query term selection and to developing effective thesauri.

A XML-based Metadata Engine Design for Effective Retrieval in PVR System (PVR 시스템에서 효율적인 검색을 위한 XML 메타데이터 엔진설계)

  • 신은영;박성한
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10b
    • /
    • pp.574-576
    • /
    • 2004
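  • With digital broadcasting, PVRs and set-top boxes equipped with storage media have emerged, but the vast amount of content makes selection difficult. To address this problem, PVRs provide metadata for multimedia data based on the TV-Anytime and MPEG-7 standards. This metadata contains characteristic information describing the multimedia data and thereby helps users select and search content. However, because the metadata consists of large XML documents, efficient and fast retrieval is not easy. This paper designs an XML metadata engine for efficient retrieval based on these characteristics of XML metadata. The proposed engine minimizes access time to the XML metadata by designing an indexing structure based on the informational characteristics of the metadata.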
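
One way to picture indexing XML metadata for fast lookup is the small Python sketch below. It is an illustration under stated assumptions, not the engine proposed in the paper: the element names (`programs`, `program`, `title`, `genre`) are hypothetical and do not follow the TV-Anytime or MPEG-7 schemas, and the index is a plain in-memory inverted index keyed by (field, term).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical metadata; real TV-Anytime documents are far larger.
SAMPLE = """<programs>
  <program id="p1"><title>Evening News</title><genre>news</genre></program>
  <program id="p2"><title>Morning News</title><genre>news</genre></program>
  <program id="p3"><title>Evening Movie</title><genre>film</genre></program>
</programs>"""


def build_index(xml_text):
    """Parse the XML once and map (field, term) to the matching program ids."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for program in root.iter("program"):
        program_id = program.get("id")
        for field in program:
            for token in (field.text or "").lower().split():
                index[(field.tag, token)].add(program_id)
    return index


def lookup(index, field, term):
    """Return sorted program ids whose given field contains the term."""
    return sorted(index.get((field, term.lower()), set()))
```

After `idx = build_index(SAMPLE)`, a call such as `lookup(idx, "title", "evening")` is answered from the index without re-scanning the XML, which is the general idea behind trading a one-time parse for shorter access time.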


HyREX: Universal XML Retrieval Engine for XML (다국어를 지원하는 XML 문서 검색 시스템: HyREX)

  • Han, Ye-Ji;Chae, Jong-Dae;Kim, Su-Hee
    • Annual Conference of KIPS
    • /
    • 2002.11c
    • /
    • pp.1713-1716
    • /
    • 2002
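  • HyREX is a research prototype XML hypermedia document retrieval system that supports multiple languages. HyREX consists of HyPath, a physical layer that handles efficient access paths for retrieval; XIRQL, a logical layer that processes queries; and HyGate, the user interface layer. In this study, to extend the existing HyREX system, which supports retrieval in English, German, and other languages, into a Korean XML document retrieval system, we first implemented a class for the Korean data type. In future work, to improve precision and recall in Korean XML document retrieval, we plan to assign weights to each document's index terms using the $tf \cdot idf$ formula and to develop this weighting.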
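
The $tf \cdot idf$ weighting mentioned in this abstract can be sketched in Python as follows. The abstract does not say which variant is intended, so this sketch assumes the common form of raw term frequency multiplied by $\log(N / df)$. The function name and the sample tokens are hypothetical, and Korean tokenization is assumed to have been done already.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf * log(N / df) weights for each term of each document.

    docs: list of token lists (one list per document).
    Returns one {term: weight} dict per document, in the same order.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical pre-tokenized documents.
docs = [["xml", "검색", "xml"], ["검색", "시스템"]]
weights = tf_idf(docs)
```

In this variant a term that appears in every document, such as "검색" above, receives weight 0, while a term concentrated in one document, such as "xml", receives a positive weight.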


An Implementation of Web-Based Korean Language Information Retrieval System (웹기반 한글정보검색시스템의 구현)

  • Hong, G.C.;Chung, H.S.
    • Electronics and Telecommunications Trends
    • /
    • v.14 no.6 s.60
    • /
    • pp.9-21
    • /
    • 1999
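  • Vast amounts of information are now created and distributed on the Internet every day, and the number of information-providing sites keeps growing. To find and use the information they need, users rely on systems operating on the Internet, including foreign search engines such as Yahoo and AltaVista and domestic search engines such as Simmani (심마니) and Miss Dachanni (미스 다찾니). However, most of these systems, rather than providing their own information, are meta search engines that use robot agents to collect and analyze homepage information from various fields registered on Internet sites and then link to related sites. Because they also return unnecessary information, users spend too much time and effort finding the information they need. This article therefore describes the implementation of an Internet-based Korean information retrieval system. The system uses morphological analysis and a thesaurus to improve retrieval precision and recall, and it integrates keyword-based Boolean search, hypertext-based keyword catalog search, intelligent integrated search that merges the results of queries sent to search engines at different sites, and a Selective Dissemination of Information (SDI) service that periodically delivers newly updated information based on user profiles.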

A Proposal of Methods for Extracting Temporal Information of History-related Web Document based on Historical Objects Using Machine Learning Techniques (역사객체 기반의 기계학습 기법을 활용한 웹 문서의 시간정보 추출 방안 제안)

  • Lee, Jun;KWON, YongJin
    • Journal of Internet Computing and Services
    • /
    • v.16 no.4
    • /
    • pp.39-50
    • /
    • 2015
  • When retrieving information through a search engine, some users want documents that correspond to a specific time period. For example, a user who wants a document describing the situation before the 'Japanese invasions of Korea' era may enter the keyword 'Japanese invasions of Korea' as the search query. The search engine then returns all documents about 'Japanese invasions of Korea' without regard to time period, which forces the user to do additional work. In addition, in a large percentage of historical documents, the date the document was generated differs from the time period its contents record. If the time period of a document's contents can be extracted, it can support more effective retrieval and various applications. In this paper, we therefore study extracting the time period of historical documents about the Joseon era, using historical literature on the Joseon era to derive the time period that corresponds to a document's content. We define historical objects based on historical literature collected from the web and confirm the feasibility of extracting the time period of a web document with machine learning techniques. In addition to the machine learning techniques, we propose and apply similarity filtering based on comparison between historical objects. Finally, we evaluate the accuracy of the temporal indexing and its improvement.

Development of a National R&D Knowledge Map Using the Subject-Object Relation based on Ontology (온톨로지 기반의 주제-객체관계를 이용한 국가 R&D 지식맵 구축)

  • Yang, Myung-Seok;Kang, Nam-Kyu;Kim, Yun-Jeong;Choi, Kwang-Nam;Kim, Young-Kuk
    • Journal of the Korean Society for Information Management
    • /
    • v.29 no.4
    • /
    • pp.123-142
    • /
    • 2012
  • Various methods, such as the Semantic Web, have been used to develop intelligent search engines that help users retrieve information effectively, and one effective method among them uses ontology technology. In this paper, we built a National R&D ontology after analyzing the National R&D information in NTIS, and then implemented a National R&D Knowledge Map to represent and retrieve information on the relationships between objects and subjects (project, human information, organization, research result) in the R&D ontology. In the National R&D Knowledge Map, the center node is the object selected by the user, each node is a subject, and each subject's sub-node is a user's favorite query in the National R&D ontology, derived by analyzing the relationship between object and subject. When a user selects a sub-node, the system issues a SPARQL query against the National R&D ontology and displays the results from the inference engine.

Information Retrieval System for R2SS (R2SS 기반의 정보검색 시스템)

  • Hong, Seok-Joo;Park, Young-Bae
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.12
    • /
    • pp.39-51
    • /
    • 2009
  • This study concerns the design and implementation of an intelligent information search engine based on $R^2SS$ (Reverse Really Simple Syndication). In the previous method, the user enters an intended RSS address and obtains limited RSS information. In the Reverse RSS (Really Simple Syndication) method, by contrast, the user just types in the information of interest and obtains matching RSS information from standard documents across several RSS addresses, which an automated RSS address collection server gathers in real time. The proposed $R^2SS$-based intelligent information search engine can save significant time while delivering good-quality information, and it has the effect of giving the user a personal secretary.

A Document Summary System based on Personalized Web Search Systems (개인화 웹 검색 시스템 기반의 문서 요약 시스템)

  • Kim, Dong-Wook;Kang, Soo-Yong;Kim, Han-Joon;Lee, Byung-Jeong;Chang, Jae-Young
    • Journal of Digital Contents Society
    • /
    • v.11 no.3
    • /
    • pp.357-365
    • /
    • 2010
  • A personalized web search engine provides personalized results to users through query expansion, re-ranking, or other methods that represent the user's intention. The personalized result page includes the URL, the page title, and a small text fragment of each web document, known as a snippet. The snippet is a summary of the document that includes the keywords issued by either the user or the search engine itself. Users can easily verify the relevance of the whole document using only the snippet. The document summary (snippet) is thus important information that lets users decide whether to click the link to the whole document. Hence, if a search engine generates personalized document summaries, it can provide more satisfactory search results to users. In this paper, we propose a personalized document summary system for personalized web search engines. The proposed system increases user satisfaction with marginal overhead.