• Title/Summary/Keyword: automatic information extraction (자동정보 추출)

Postal Envelope Image Recognition System for Postal Automation (서장 우편물 자동처리를 위한 우편영상 인식 시스템)

  • Kim, Ho-Yon;Lim, Kil-Taek;Kim, Doo-Sik;Nam, Yun-Seok
    • The KIPS Transactions:PartB / v.10B no.4 / pp.429-442 / 2003
  • In this paper, we describe an address image recognition system for the automatic processing of standard-size letter mail. The inputs to the system are gray-level mail piece images, and the outputs are delivery point codes from which a carrier's delivery sequence can be generated. The system includes five main modules: destination address block location, text line separation, character segmentation, character recognition, and address interpretation. The destination address block is extracted on the basis of experimental knowledge, and line separation and character segmentation are done through the analysis of connected components and vertical runs. We developed MLP-based recognizers for character recognition and a dynamic programming technique for address interpretation. Since each module has been implemented independently, each one is relatively easy to optimize. We ran the experiment on live mail piece images sampled directly from a mail sorting machine at the Yuseong post office. The experimental results demonstrate the feasibility of our system.

Misclassified Area Detection Algorithm for Aerial LiDAR Digital Terrain Data (항공 라이다 수치지면자료의 오분류 영역 탐지 알고리즘)

  • Kim, Min-Chul;Noh, Myoung-Jong;Cho, Woo-Sug;Bang, Ki-In;Park, Jun-Ku
    • Journal of Korean Society for Geospatial Information Science / v.19 no.1 / pp.79-86 / 2011
  • Recently, aerial laser scanning technology has received considerable attention for constructing DEMs (Digital Elevation Models). It is well known that the quality of a DEM is mostly influenced by the accuracy of the DTD (Digital Terrain Data) extracted from LiDAR (Light Detection And Ranging) raw data. However, the DTD generated by an automatic filtering process always contains misclassified data, owing to the limitations of automatic filtering algorithms and the intrinsic properties of LiDAR raw data. To eliminate the misclassified data, a manual filtering process is performed right after the automatic filtering process. This study proposes an algorithm that automatically detects possibly misclassified data in the DTD produced by automatic filtering, which reduces the load of the manual filtering process. The algorithm runs on a 2D grid data structure and makes use of several parameters such as 'Slope Angle', 'Slope DeltaH' and 'NNMaxDH (Nearest Neighbor Max Delta Height)'. The experimental results show that the proposed algorithm detected the misclassified data quite well regardless of terrain type and LiDAR point density.
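
The abstract names an 'NNMaxDH' parameter on a 2D grid but does not define it. The following is a minimal sketch of one plausible reading, assuming NNMaxDH is the largest absolute height difference between a grid cell and its 8 nearest neighbours, and that a cell is flagged when that value exceeds a threshold. The function names, the threshold value, and the sample grid are hypothetical and are not taken from the paper.

```python
def nn_max_dh(grid):
    """For each cell, return the largest absolute height difference
    to any of its (up to) 8 neighbours. Assumed reading of 'NNMaxDH'."""
    rows, cols = len(grid), len(grid[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    # Skip the cell itself and anything outside the grid.
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        diff = abs(grid[r][c] - grid[nr][nc])
                        result[r][c] = max(result[r][c], diff)
    return result


def flag_suspect_cells(grid, max_dh):
    """Mark cells whose neighbour height jump exceeds max_dh (hypothetical threshold)."""
    return [[dh > max_dh for dh in row] for row in nn_max_dh(grid)]


# Flat 5x5 terrain at 10 m with one 15 m spike left behind by filtering.
dtm = [[10.0] * 5 for _ in range(5)]
dtm[2][2] = 15.0
suspects = flag_suspect_cells(dtm, max_dh=2.0)
```

With this definition the spike and its immediate neighbours are all flagged, so the output marks a suspect area rather than a single point; the slope-based parameters mentioned in the abstract are not modelled here.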

MORPHEUS: A More Scalable Comparison-Shopping Agent (MORPHEUS: 확장성이 있는 비교 쇼핑 에이전트)

  • Yang, Jae-Yeong;Kim, Tae-Hyeong;Choe, Jung-Min
    • Journal of KIISE:Software and Applications / v.28 no.2 / pp.179-191 / 2001
  • Comparison shopping is a merchant brokering process that finds the best price for a desired product from several Web-based online stores. To get a scalable comparison shopper, we need an agent that automatically constructs a simple information extraction procedure, called a wrapper, for each semi-structured store. Automatic construction of wrappers for HTML-based Web stores is difficult because HTML only defines how information is to be displayed, not what it means, and different stores employ different ways of handling customer queries and different presentation formats for displaying product descriptions. Wrapper induction has been suggested as a promising strategy for overcoming this heterogeneity. However, previous scalable comparison shoppers such as ShopBot rely on a strong bias in the product descriptions, and as a result, many stores that do not conform to this bias could not be recognized. This paper proposes a more scalable comparison-shopping agent named MORPHEUS. MORPHEUS presents a simple but robust inductive learning algorithm that automatically constructs wrappers. The main idea of the proposed algorithm is to recognize the position and the structure of a product description unit by finding the most frequent pattern in the sequence of logical line information in output HTML pages. MORPHEUS successfully constructs correct wrappers for most stores by weakening the bias assumed in previous systems. It also tolerates some noise that might be present in product descriptions, such as missing attributes. MORPHEUS generates wrappers rapidly because it omits the pre-processing phase of removing redundant page fragments such as headers, trailers, and advertisements. Finally, MORPHEUS provides a framework from which a customized comparison-shopping agent can be organized for a user by facilitating the dynamic addition of new stores.
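
The core idea of finding the most frequent pattern in a sequence of logical lines can be sketched as follows. This is an illustration only, not the MORPHEUS algorithm itself: the line-type labels (`IMG`, `TEXT`, `PRICE`), the length bounds, and the tie-breaking rule (prefer the longer pattern among equally frequent ones) are all assumptions made for the example.

```python
from collections import Counter


def most_frequent_pattern(lines, min_len=2, max_len=6):
    """Return the contiguous run of line types that occurs most often.
    Ties on frequency are broken in favour of the longer pattern (assumption)."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(lines) - n + 1):
            counts[tuple(lines[i:i + n])] += 1
    return max(counts, key=lambda p: (counts[p], len(p)))


def find_units(lines, pattern):
    """Return the start index of each non-overlapping occurrence of pattern."""
    starts, i, n = [], 0, len(pattern)
    while i <= len(lines) - n:
        if tuple(lines[i:i + n]) == pattern:
            starts.append(i)
            i += n  # jump past this unit so occurrences do not overlap
        else:
            i += 1
    return starts


# Hypothetical logical-line types for one result page of an online store.
page = ["HEADER",
        "IMG", "TEXT", "PRICE",
        "IMG", "TEXT", "PRICE",
        "IMG", "TEXT", "PRICE",
        "FOOTER"]

pattern = most_frequent_pattern(page)
units = find_units(page, pattern)
```

Because the header and footer lines never recur, they do not take part in the winning pattern, which is in the spirit of the abstract's claim that a separate step for removing redundant fragments is not needed.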

A Comparative Analysis of Content-based Music Retrieval Systems (내용기반 음악검색 시스템의 비교 분석)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.23-48 / 2013
  • This study compared and analyzed 15 CBMR (Content-based Music Retrieval) systems accessible on the web in terms of DB size and type, query type, access point, input and output type, and search functions. It also reviewed the features of music information and the techniques used for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing music features, and matching in CBMR systems. The analysis found three things: text information retrieval techniques (inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, similarity measures using edit distance, sorting, etc.) being applied to enhance CBMR; efforts to increase DB size and usability; and problems in extracting melodies, deleting stop notes in queries, and using solfege as pitch information.
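
One of the text-retrieval techniques listed above, similarity by edit distance, carries over to melodies when each melody is treated as a sequence of note symbols. The sketch below uses the standard Levenshtein distance; the note-name representation and the tiny melody database are hypothetical and do not come from any of the surveyed systems.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical query and melody database, written as note names.
query = ["C", "D", "E", "F", "G"]
melodies = {
    "scale": ["C", "D", "E", "F", "G", "A"],
    "arpeggio": ["G", "E", "C", "E", "G"],
}

# Rank melodies by how few edits turn the query into them.
best = min(melodies, key=lambda name: edit_distance(query, melodies[name]))
```

Matching on pitch intervals rather than absolute note names would make the comparison independent of key, at the cost of one extra preprocessing step.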

A Semi-Automatic Building Modeling System Using a Single Satellite Image (단일 위성 영상 기반의 반자동 건물 모델링 시스템)

  • Oh, Seon-Ho;Jang, Kyung-Ho;Jung, Soon-Ki
    • The KIPS Transactions:PartB / v.16B no.6 / pp.451-462 / 2009
  • The spread of satellite imagery has increased the range of services that use it. In particular, 3D visualization services covering the whole earth, such as Google Earth™ and Virtual Earth™, and 3D GIS services for individual cities provide realistic geometric information about buildings and terrain over wide areas. These services can be used in fields such as urban planning, road improvement, entertainment, military simulation, and emergency response. Research on effectively extracting building and terrain information from high-resolution satellite images is therefore required. This paper examines the requirements for building model extraction and then presents a system for effectively extracting building models from a single high-resolution satellite image. The proposed system uses geometric features of the satellite image and the geometric relationships among the building, its shadow, and the positions of the sun and the satellite to minimize user interaction. Finally, 3D building extraction results are used to show that models can be extracted effectively from a single high-resolution satellite image.

Terminology Tagging System using elements of Korean Encyclopedia (백과사전 기반 전문용어 태깅 시스템)

A Study on the Knowledge-Based System for Automatic Abstracting (자동 초록을 위한 지식 기반 시스템 설계에 관한 연구)

  • 최인숙
    • Journal of the Korean Society for Information Management / v.6 no.1 / pp.93-117 / 1989
  • The objective of this study is to design an automatic abstracting system through the analysis of natural language texts. For this purpose, a knowledge-based system operating on the basis of domain knowledge was developed. The procedure for generating an abstract consists of three steps: (1) A knowledge base containing the domain knowledge necessary to understand a text is constructed using frame and semantic network structures, and preliminary abstracts are prepared for various cases. (2) The input text is analyzed on the basis of the domain knowledge in order to extract the information with which to fill the slots of the abstract. (3) The preliminary abstract corresponding to the input text is retrieved and filled with that information, completing the abstract.

Video Object Extraction Using Contour Information (윤곽선 정보를 이용한 동영상에서의 객체 추출)

  • Kim, Jae-Kwang;Lee, Jae-Ho;Kim, Chang-Ick
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.1 / pp.33-45 / 2011
  • In this paper, we present a method for extracting video objects efficiently by using a modified graph cut algorithm based on contour information. First, we extract objects in the first frame with an automatic object extraction algorithm or through user interaction. To estimate the objects' contours in the current frame, the motion of the objects' contours in the previous frame is analyzed. Block-based histogram back-projection is then conducted along the estimated contour points, and color models of the objects and the background are generated from the back-projection images. The probabilities of links between neighboring pixels are determined by a logarithm-based distance transform map obtained from the estimated contour image. The energy of the graph is defined by the color models and the logarithmic distance transform map. Finally, the object is extracted by minimizing this energy. Experimental results on various test images show that our algorithm is more accurate than other methods.

Automatic Product Feature Extraction for Efficient Analysis of Product Reviews Using Term Statistics (효율적인 상품평 분석을 위한 어휘 통계 정보 기반 평가 항목 추출 시스템)

  • Lee, Woo-Chul;Lee, Hyun-Ah;Lee, Kong-Joo
    • The KIPS Transactions:PartB / v.16B no.6 / pp.497-502 / 2009
  • In this paper, we introduce an automatic product feature extraction system that improves the efficiency of product review analysis. Our system consists of two parts: a review collection and correction part and a product feature extraction part. The former collects reviews from internet shopping malls and corrects spoken-style or ungrammatical sentences. The latter automatically extracts product features, meaning items that can be used as evaluation criteria (such as 'size' and 'style' for a skirt), by using term statistics from reviews and from web documents on the Internet. We choose nouns in reviews as candidate product features and calculate the degree of association between candidate nouns and products by combining an inner association degree and an outer association degree. The inner association degree is calculated from noun frequency in reviews, and the outer association degree is calculated from the co-occurrence frequency of a candidate noun and a product name in web documents. In the evaluation, our extraction method showed an average recall of 90%, which is better than the results of previous approaches.
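
The abstract says the two association degrees are combined but does not give the formula. As a rough sketch only, the example below assumes relative frequency for the inner degree, a co-occurrence ratio for the outer degree, and a weighted sum with a hypothetical weight `alpha` to combine them. All counts are invented for illustration.

```python
def inner_association(noun_count, total_noun_count):
    """Share of all candidate-noun occurrences in the reviews (assumed form)."""
    return noun_count / total_noun_count


def outer_association(cooccurrence_hits, noun_hits):
    """Fraction of web documents containing the noun that also contain
    the product name (assumed form)."""
    return cooccurrence_hits / noun_hits if noun_hits else 0.0


def association(inner, outer, alpha=0.5):
    """Weighted sum of the two degrees; alpha is a hypothetical weight."""
    return alpha * inner + (1 - alpha) * outer


# Invented counts for the product "skirt".
review_counts = {"size": 40, "style": 30, "delivery": 30}
web_hits = {  # noun: (documents with noun and "skirt", documents with noun)
    "size": (300, 1000),
    "style": (450, 1000),
    "delivery": (50, 1000),
}

total = sum(review_counts.values())
scores = {
    noun: association(inner_association(count, total),
                      outer_association(*web_hits[noun]))
    for noun, count in review_counts.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data 'delivery' is frequent in reviews but rarely co-occurs with the product name on the web, so the outer degree pushes it to the bottom of the ranking.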

Feature Extraction of Web Document using Association Word Mining (연관 단어 마이닝을 사용한 웹문서의 특징 추출)

  • 고수정;최준혁;이정현
    • Journal of KIISE:Databases / v.30 no.4 / pp.351-361 / 2003
  • Previous studies that extract document features through word association suffer from the problems of periodically updating profiles, dealing with noun phrases, and calculating probabilities for indices. We propose a more effective feature extraction method that uses association word mining. Using the Apriori algorithm, the association word mining method represents a document feature not as single words but as association-word vectors. The association words extracted from a document by the Apriori algorithm depend on confidence, support, and the number of words composing each association word. This paper proposes an effective method for determining these three values. Because the feature extraction method using association word mining does not use a profile, it does not need to update one, and it automatically generates noun phrases by using confidence and support in the Apriori algorithm without calculating probabilities for indices. We apply the proposed method to document classification using a Naive Bayes classifier and compare it with methods based on information gain and TF-IDF. We also compare the proposed method with document classification methods that use index association and word association based on a probability model, respectively.
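
To make the roles of support and confidence concrete, here is a compact sketch of the generic Apriori algorithm applied to sets of words. It is not the paper's implementation: the transactions (one set of nouns per sentence), the thresholds, and the function names are assumptions for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every word set meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    singles = {frozenset([w]) for t in transactions for w in t}
    level = {s for s in singles if support(s) >= min_support}
    frequent, k = {}, 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = {c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, k - 1))
                 and support(c) >= min_support}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Hypothetical nouns extracted from four sentences of one web document.
sentences = [
    {"game", "graphic", "sound"},
    {"game", "graphic"},
    {"game", "sound"},
    {"network", "protocol"},
]
frequent = apriori(sentences, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.8)
```

Raising `min_support` shrinks the set of association words, while raising `min_confidence` keeps only the stronger word-to-word implications, which is why the choice of these thresholds matters for the resulting document features.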