• Title/Summary/Keyword: Text Retrieval


Image Based Text Matching Using Local Crowdedness and Hausdorff Distance (지역 밀집도 및 Hausdorff 거리를 이용한 영상기반 텍스트 매칭)

  • Son, Hwa-Jeong;Kim, Ji-Soo;Park, Mi-Seon;Yoo, Jae-Myeong;Kim, Soo-Hyung
    • The Journal of the Korea Contents Association / v.6 no.10 / pp.134-142 / 2006
  • In this paper, we investigate whether the Hausdorff distance, which is used to measure image similarity, is also effective for document retrieval. The proposed method uses local crowdedness and a Hausdorff distance to locate text images by determining whether a pair of images scanned at different times comes from the same text. To reduce the processing time, which is one of the disadvantages of Hausdorff distance algorithms, we adopt local crowdedness for feature point extraction. We apply the proposed method to 190 same-class pairs and 190 different-class pairs collected from postal envelope images. The results show that the modified Hausdorff distance proposed in this paper performs well in locating the text region and calculating the degree of similarity between two images. Accuracy improves by 2.7% over a binary correlation method and by 9.0% over the original Hausdorff distance method.

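
As a rough illustration of the distance measure discussed in the abstract above, the sketch below compares the classic Hausdorff distance with a commonly used averaged variant on two small point sets. The abstract does not define the authors' own modification or their local-crowdedness feature extraction, so the averaged variant, the function names, and the sample points here are all assumptions made for illustration.

```python
import math

def directed_distances(a, b):
    """For each point in a, the Euclidean distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]

def hausdorff(a, b):
    """Classic Hausdorff distance: the worst-case nearest-neighbour distance."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))

def modified_hausdorff(a, b):
    """Averaged variant (an assumption, not the paper's definition):
    mean nearest-neighbour distance, taken in both directions."""
    forward = sum(directed_distances(a, b)) / len(a)
    backward = sum(directed_distances(b, a)) / len(b)
    return max(forward, backward)

# Hypothetical feature points; the second set contains one outlier at (4, 0).
points_a = [(0, 0), (1, 0), (0, 1)]
points_b = [(0, 0), (1, 0), (0, 1), (4, 0)]

print(hausdorff(points_a, points_b))           # dominated by the single outlier
print(modified_hausdorff(points_a, points_b))  # outlier is averaged down
```

The single outlier fully determines the classic distance, while the averaged variant dilutes it, which is one reason averaged forms are often preferred for noisy scanned images.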

Improvement OCR Algorithm for Efficient Book Catalog Retrieval Technology (효과적인 도서목록 검색을 위한 개선된 OCR알고리즘에 관한 연구)

  • HeWen, HeWen;Baek, Young-Hyun;Moon, Sung-Ryong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.152-159 / 2010
  • Existing character recognition algorithms recognize characters only under simple conditions. Their recognition rates often drop drastically when the input document image is of low quality or contains rotated text or text in various fonts and sizes, because of external noise or data loss. This paper proposes an optical character recognition algorithm that uses bicubic interpolation for catalog retrieval when the input image contains rotated or blurred text in various fonts and sizes. The algorithm consists of a detection part and a recognition part. The detection part applies the Roberts and Hausdorff distance algorithms to detect the book catalog correctly. The recognition part applies bicubic interpolation to compensate for data loss caused by low image quality and by varied fonts and text sizes. The interpolated image is then rotated to correct slant. Experimental results show that the proposed method improves the recognition rate by 6% and achieves a search time of 1.077 s.
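
The detection step named in the abstract above relies on Roberts edge detection. A minimal sketch of the standard Roberts cross operator follows, assuming a grayscale image stored as a list of rows; the function name and the sample image are hypothetical, and the abstract does not say how the authors combine the edge map with the Hausdorff distance.

```python
import math

def roberts_cross(image):
    """Return the gradient magnitude of a 2-D grayscale image (list of rows).

    The output is one row and one column smaller than the input because each
    value is computed from a 2x2 neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for y in range(rows - 1):
        row = []
        for x in range(cols - 1):
            gx = image[y][x] - image[y + 1][x + 1]  # difference along one diagonal
            gy = image[y][x + 1] - image[y + 1][x]  # difference along the other diagonal
            row.append(math.hypot(gx, gy))
        edges.append(row)
    return edges

# Hypothetical 3x4 image with a vertical step edge between columns 1 and 2.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
edges = roberts_cross(image)
for row in edges:
    print(row)
```

Only the 2x2 windows that straddle the step produce a non-zero response; flat regions on either side yield zero.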

Relevant Image Retrieval of Korean Documents based on Sentence and Word Importance (문장 및 단어 중요도를 통한 한국어 문서 연관 이미지 검색)

  • Kim, Nam-Gyu;Kang, Shin-Jae
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.43-48 / 2019
  • When readers encounter unknown words in text-only documents, they lose focus and may fail to understand the content. Children in particular, having little experience, find it difficult to understand a text correctly when a description is unfamiliar or ambiguous in context. In this paper, to aid comprehension and increase readers' interest, we analyze the text of a document, select the content considered important, and implement a system that automatically retrieves the most relevant images from the web and links them to the text. The system divides an article into paragraphs, analyzes the text, selects the important sentences in each paragraph and the important words that best represent the meaning of those sentences, searches the web for images related to the words, and then links the images to the corresponding paragraphs. The experiments show how to select important sentences and how to select important words within those sentences. Evaluating the relation between three selected images and the corresponding important sentences yielded an accuracy of 60%.
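
The pipeline in the abstract above (pick an important sentence per paragraph, then pick the words that best represent it) can be sketched as follows. The abstract does not state how importance is scored, so this sketch assumes plain term frequency within the paragraph as a stand-in; the stop-word list, function names, and English sample text are hypothetical, and a Korean system would need a morphological tokenizer instead of the regular expression used here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "on"}

def tokenize(text):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def split_sentences(paragraph):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def important_sentence(paragraph):
    """Pick the sentence whose words are, on average, most frequent in the paragraph."""
    freq = Counter(tokenize(paragraph))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return max(split_sentences(paragraph), key=score)

def important_words(sentence, paragraph, k=2):
    """Pick the k words of the sentence that are most frequent in the paragraph."""
    freq = Counter(tokenize(paragraph))
    unique = list(dict.fromkeys(tokenize(sentence)))  # de-duplicate, keep first-seen order
    return sorted(unique, key=lambda w: -freq[w])[:k]

paragraph = (
    "Volcanoes erupt when magma rises. "
    "Magma is molten rock. "
    "Many people visit parks."
)
best = important_sentence(paragraph)
keywords = important_words(best, paragraph)
print(best)
print(keywords)
```

The selected keywords would then serve as the query for a web image search, which is omitted here because the abstract names no specific search service.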

A new approach for overlay text detection from complex video scene (새로운 비디오 자막 영역 검출 기법)

  • Kim, Won-Jun;Kim, Chang-Ick
    • Journal of Broadcast Engineering / v.13 no.4 / pp.544-553 / 2008
  • With the development of video editing technology, overlay text is increasingly inserted into video content to give viewers better visual understanding. Because inserted text can represent the content of a scene or the editor's intention well, it is useful for video information retrieval and indexing. Most previous approaches are based on low-level features such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted into a complex background. In this paper, we propose a novel framework to localize the overlay text in a video scene. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is generated. Candidate regions are then extracted using the transition map, and the overlay text is finally determined based on the density of state in each candidate. The proposed method is robust to the color, size, position, style, and contrast of overlay text, and it is language independent. Text region updating between frames is also exploited to reduce the processing time. Experiments on diverse videos confirm the efficiency of the proposed method.

Natural Language based Video Retrieval System with Event Analysis of Multi-camera Image Sequence in Office Environment (사무실 환경 내 다중카메라 영상의 이벤트분석을 통한 자연어 기반 동영상 검색시스템)

  • Lim, Soo-Jung;Hong, Jin-Hyuk;Cho, Sung-Bae
    • Proceedings of the HCI Society of Korea Conference / 2008.02a / pp.384-389 / 2008
  • The need for systems that store and retrieve video data effectively has recently increased. Conventional video retrieval systems retrieve data using menus or text-based keywords. Because such queries carry little information, many video clips are returned at once, and the user must have a certain level of knowledge to use the system. In this paper, we propose a natural language based conversational video retrieval system that reflects users' intentions and carries more information than keyword-based queries. The system can also retrieve videos by anything from events or people to their movements. First, an event database is constructed from metadata generated by domain analysis of video collected in an office environment. Then, a script database is constructed based on query pre-processing and analysis. On this basis, a method for retrieving video by matching natural language queries against answers is proposed and validated through performance and process evaluation with 10 users. The natural language based retrieval system showed better performance and user satisfaction than the menu-based retrieval system.


A Study on Contents-based Retrieval using Wavelet (Wavelet을 이용한 내용기반 검색에 관한 연구)

  • 강진석;박재필;나인호;최연성;김장형
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.5 / pp.1051-1066 / 2000
  • With recent advances in digital encoding technologies and computing power, large amounts of multimedia information such as images, graphics, audio, and video are widely used in multimedia systems over the Internet. As a result, users need diverse retrieval mechanisms to search for specific information stored in multimedia systems, and content-based retrieval is preferred to text keyword retrieval. In this paper, we propose a new content-based indexing and searching algorithm that aims at both high efficiency and high retrieval performance. To achieve these objectives, the proposed algorithm first classifies images through pre-processing that consists of edge extraction, range division, and multiple filtering. It then searches for the target images using the spatial and textural characteristics of colors in an image, which are extracted in the previous step. In addition, we describe simulation results of search requests and retrieval outputs for several company trademark images using the proposed wavelet-based content retrieval algorithm.


Semantic Video Retrieval Based On User Preference (사용자 선호도를 고려한 의미기반 비디오 검색)

  • Jung, Min-Young;Park, Sung-Han
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.4 / pp.127-133 / 2009
  • To ensure access to rapidly growing video collections, video indexing is becoming more and more essential. A video database should be built for fast searching and for accurate extraction of the features of video information, which has complex characteristics. Moreover, a video indexing structure should support efficient retrieval of interesting content that reflects user preferences. In this paper, we propose a semantic video retrieval method based on user preference. Previous methods do not consider user preferences. Furthermore, conventional methods return results by simple text matching against the user's query, which does not support semantic search. To overcome these limitations, we develop a method for user preference analysis and present a method of video ontology construction for semantic retrieval. The simulation results show that the proposed algorithm performs better than previous methods in semantic video retrieval based on user preferences.

A Study on Patent Structure in Patent Full-text Retrieval (특허정보 전문검색을 위한 문헌구조화 연구)

  • 권영숙;이두영
    • Proceedings of the Korean Society for Information Management Conference / 1999.08a / pp.29-32 / 1999
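  • Patent information has characteristics that differ from those of general scientific and technical information, so accuracy and currency are absolutely essential. Taking these characteristics into account, this study examines the structure of patent documents as a basis for building a patent information retrieval system that meets users' information needs and supports effective searching.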


A Comparison of Retrieval Effectiveness between Image File and Text File (이미지 화일과 텍스트 화일의 검색효율성 비교)

  • 임영선;이두영
    • Proceedings of the Korean Society for Information Management Conference / 1996.08a / pp.15-18 / 1996
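  • This paper compares the retrieval effectiveness of a text full-text database, in which the entire body is stored as machine-readable files, with that of an image full-text database, which consists of image files, in order to suggest which kind of full-text database is preferable from the standpoint of libraries and end users.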
