• Title/Summary/Keyword: Text features

Robust Algorithms for Combining Multiple Term Weighting Vectors for Document Classification

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.2 / pp.81-86 / 2016
  • Term weighting is a popular technique that weights term features to improve accuracy in document classification. While several successful term weighting algorithms have been suggested, none of them appears to perform well consistently across different data domains. In this paper, we propose several methods for combining different term weight vectors to yield a robust document classifier that performs consistently well on diverse datasets. Specifically, we suggest two approaches: i) learning a single weight vector that lies in the convex hull of the base vectors while minimizing the class prediction loss, and ii) a mini-max classifier that aims for robustness across the individual weight vectors by minimizing the loss of the worst-performing strategy among the base vectors. We provide efficient solution methods for these optimization problems. The effectiveness and robustness of the proposed approaches are demonstrated on several benchmark document datasets, where they significantly outperform existing term weighting methods.
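
The two objectives named in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the paper's solution methods, so this sketch assumes a coarse grid search over the simplex as a stand-in, and all function names (`convex_combination`, `simplex_grid`, `best_convex_weights`, `minimax_choice`) are hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Vector = List[float]


def convex_combination(base_vectors: Sequence[Vector], alphas: Sequence[float]) -> Vector:
    """Return sum_k alphas[k] * base_vectors[k], with alphas on the simplex."""
    if any(a < 0 for a in alphas) or abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("alphas must be non-negative and sum to 1")
    dim = len(base_vectors[0])
    return [sum(a * v[i] for a, v in zip(alphas, base_vectors)) for i in range(dim)]


def simplex_grid(k: int, steps: int) -> Iterator[List[float]]:
    """Yield every length-k coefficient vector whose entries are multiples
    of 1/steps and sum to 1 (a coarse discretization of the simplex)."""
    def rec(remaining: int, slots: int) -> Iterator[List[int]]:
        if slots == 1:
            yield [remaining]
            return
        for c in range(remaining + 1):
            for tail in rec(remaining - c, slots - 1):
                yield [c] + tail

    for counts in rec(steps, k):
        yield [c / steps for c in counts]


def best_convex_weights(
    base_vectors: Sequence[Vector],
    loss: Callable[[Vector], float],
    steps: int = 10,
) -> Tuple[List[float], Vector]:
    """Approach (i), assumed grid-search version: pick the point in the
    convex hull of the base vectors with the smallest loss on the grid."""
    best_alphas: List[float] = []
    best_vector: Vector = []
    best_loss = float("inf")
    for alphas in simplex_grid(len(base_vectors), steps):
        candidate = convex_combination(base_vectors, alphas)
        value = loss(candidate)
        if value < best_loss:
            best_alphas, best_vector, best_loss = alphas, candidate, value
    return best_alphas, best_vector


def minimax_choice(losses_per_candidate: Dict[str, Sequence[float]]) -> str:
    """Approach (ii), simplified: among candidate classifiers, pick the one
    whose worst loss over the base weighting strategies is smallest."""
    return min(losses_per_candidate, key=lambda name: max(losses_per_candidate[name]))


if __name__ == "__main__":
    # Toy data: two base weight vectors and a made-up quadratic loss.
    base = [[1.0, 0.0], [0.0, 1.0]]
    target = [0.25, 0.75]
    toy_loss = lambda w: sum((a - b) ** 2 for a, b in zip(w, target))
    print(best_convex_weights(base, toy_loss, steps=4))
    print(minimax_choice({"a": [0.1, 0.9], "b": [0.4, 0.5]}))
```

In practice the loss would be a classification loss evaluated on training documents, and a gradient-based or convex solver would replace the grid search; the grid is used here only to keep the sketch self-contained.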

Robust Speaker Identification Using Linear Transformation Optimized for Diagonal Covariance GMM (대각공분산 GMM에 최적인 선형변환을 이용한 강인한 화자식별)

  • Kim, Min-Seok;Yang, Il-Ho;Yu, Ha-Jin
    • MALSORI / no.65 / pp.67-80 / 2008
  • We have been building a text-independent speaker recognition system that is robust to unknown channel and noise environments. In this paper, we propose a linear transformation for obtaining robust features. The transformation is optimized to maximize the distances between the Gaussian mixtures. We use rotation of the axes to cope with the problem of scaling the transformation matrix. The proposed transformation is similar to PCA or LDA, but can achieve better results in some special cases where PCA and LDA do not work properly. We use the YOHO database to evaluate the proposed method and compare the results with those of PCA and LDA. The results show that the proposed method outperforms the baseline, PCA, and LDA.

Standardization Study of Font Shape Classification for Hangul Font Registration System (한글 글꼴 등록 시스템을 위한 글꼴 모양 분류체계 표준화 연구)

  • Kim, Hyun-Young;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.20 no.3 / pp.571-580 / 2017
  • Recently, many text-based communication applications have appeared on various smart devices. Unlike traditional print publishing, mobile publishing and SNS tools tend to use more decorative or emotional fonts so that users can convey feelings through their content. Font providers have therefore released new fonts to meet market demand. Although many new fonts have been released, general users have not adopted them because they can search only by font name or font provider's name, so they have no way to discover new fonts. In this study, we propose font shape classification rules, based on font design features, for a font registration system. We validated the classification standard through experiments with 50 commercial fonts. The result of this study was submitted to the Korea Telecommunication Technology Association and adopted as a Korean industrial standard.

Edge-based Text Localization Using Geometrical Features of Hangul Character in Mobile Images (모바일 영상에서 한글 문자의 기하학적 특징을 이용한 에지 기반 텍스트 검출)

  • Park, Jong-Cheon;Oh, Myoung-Kwan;Jeon, Byeong-Min
    • Proceedings of the KAIS Fall Conference / 2012.05b / pp.820-822 / 2012
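  • As mobile devices have become widespread, many applications that handle mobile images are being developed. Analyzing a mobile image and linking the extracted information to Internet search keywords enables intuitive multimedia search. This study proposes a method for detecting Hangul text regions in mobile images. Candidate Hangul character regions are detected by extracting and analyzing the geometric features of Hangul characters, and the detected candidate regions are merged using a Hangul grapheme (jaso) merging algorithm. The candidate regions are then verified using the features of the six Hangul character types, yielding the final Hangul text regions. Experiments showed improved recall, one of the performance measures for text region detection.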

The MSW Pyrolysis & Melting Plant DONGBU R21 (생활폐기물 열분해용융시설 동부 R21)

  • Choi, Sang-Sim;Kim, Seok-Hwan;Kim, Kyong-Lae
    • 한국연소학회:학술대회논문집 / 2004.06a / pp.314-328 / 2004
  • Mitsui Engineering and Shipbuilding Co., Ltd. (MES) has completed Recycling 21 (R21) pyrolysis and melting technology for municipal solid waste. The basic technology is licensed from Siemens, but MES has made major improvements to the design and operation of the R21 system. Consequently, MES has to date completed six (6) R21 plants in Japan. The following text provides a brief overview of the design and operating features of R21 technology, focusing on system reliability and low emission of hazardous materials, both of which have been proven by the successful construction and operation of these plants.

A Study on Method of Emotional Expression of the Naxi Dongba script (나시족 동파문자의 시각적 감성 표현에 관한 연구)

  • Zhang, zhong hui;Lee, dong hun
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.1010-1014 / 2009
  • As globalization accelerates, visual symbols are becoming internationally accepted and standardized. Many people have begun to tire of standardized symbols, prefer the original and natural, and hope for the emergence of new and interesting visual symbols. The Dongba script (東巴文字) is the concrete pictographic script (象形文字) of the Naxi people (納西族) of China, characterized by simplicity, abstraction, association, playfulness, decorativeness, and symbolism. These features make the script quick to recognize and satisfy people's demand for a distinctive aesthetic.

Mini PC control system for BYG type water supply units (BYG형 급수기의 MINI PC 제어 시스템)

  • 박용규;강영모
    • 제어로봇시스템학회:학술대회논문집 / 1993.10a / pp.1167-1171 / 1993
  • A highly efficient hydropneumatic water supply system, type BYG, is designed and built in accordance with ISO standards. The technical features of the BYG type pump unit can be summarized as follows: the hydropneumatic tank capacity is reduced to 1/10 to 1/30 of that required by the conventional method; ISO standard pumps can be used; and the development of the highly efficient BYG type water supply system is based on long-term experience with the proven constant-pressure water supply technique, which minimizes pressure fluctuation, rapid pulsation, etc. The text covers the operating principle of the BYG type water supply system, an introduction to the closed-cycle control process centered on the Mini PC, and experimental results for type BYG-IVS-90x45.

Framework for Ontological Knowledge-based Image Understanding Systems (Ontological 지식 기반 영상이해시스템의 구조)

  • 손세호;이인근;권순학
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2004.04a / pp.235-240 / 2004
  • In this paper, we propose a framework for ontological knowledge-based image understanding systems. An ontology composed of concepts can be used as a guide for describing objects from a specific domain of interest and for describing relations between objects from different domains. The proposed framework consists of four main subparts: ⅰ) ontological knowledge bases, ⅱ) primitive feature detectors, ⅲ) a concept inference engine, and ⅳ) a semantic inference engine. Using ontological knowledge bases on various domains and features extracted by the detectors, the concept inference engine infers concepts for regions of interest in an image, and the semantic inference engine reasons about semantic situations between concepts from different domains. We present an outline for ontological knowledge-based image understanding systems, along with application examples in specific domains such as text recognition and human recognition, in order to show the validity of the proposed system.

2D Design Feature Recognition using Expert System (전문가 시스템을 이용한 2차원 설계 특징형상의 인식)

  • 이한민;한순흥
    • Korean Journal of Computational Design and Engineering / v.6 no.2 / pp.133-139 / 2001
  • Since a great number of 2D engineering drawings are in use in industry while 3D CAD has become popular in recent years, 3D CAD models need to be reconstructed from legacy 2D drawings. In this thesis, a combination of a feature recognition method and an expert system is suggested for 3D solid model reconstruction. Modeling primitives of 3D CAD systems are recognized and constructed using the pattern matching technique of feature modeling. Additional information for the 3D model reconstruction can be generated by extracting symbols or text entities that are related to form entities. For complex and indefinite cases that cannot be solved by feature recognition, an expert system with a rule base is used for decision-making. A 3D reconstruction system that recognizes 2D DXF drawing files has been implemented; it can handle models composed of protrusions, holes, and cutouts.

Knowledge-Based Numeric Open Caption Recognition for Live Sportscast

  • Sung, Si-Hun
    • Proceedings of the IEEK Conference / 2003.07e / pp.1871-1874 / 2003
  • Knowledge-based numeric open caption recognition is proposed that can recognize numeric captions generated by a character generator (CG) and automatically superimpose a modified caption using the recognized text, but only when a valid numeric caption appears in the targeted region of a live sportscast scene produced by other broadcasting stations. In the proposed method, mesh features are extracted from an enhanced binary image as feature vectors, and the valuable information is then recovered from the numeric image by recognizing the characters with a multilayer perceptron (MLP) network. The result is verified using a knowledge-based rule set designed for a more stable and reliable output, and the modified information is then displayed on screen by the CG. MLB Eye Caption, based on the proposed algorithm, has already been used for regular Major League Baseball (MLB) programs broadcast live over a Korean nationwide TV network and has received a favorable response from Korean viewers.
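
The mesh feature step mentioned in the abstract can be sketched as follows. The abstract does not specify the mesh size or normalization, so this Python sketch assumes one common definition: the binary image is divided into a regular grid and each cell contributes the fraction of foreground pixels it contains. The function name `mesh_features` and the grid parameters are hypothetical.

```python
from typing import List, Sequence


def mesh_features(image: Sequence[Sequence[int]], rows: int, cols: int) -> List[float]:
    """Split a binary image (0 = background, 1 = foreground) into a
    rows x cols mesh and return the foreground density of each cell,
    in row-major order. Assumes the image size divides evenly."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the mesh size")
    cell_h, cell_w = height // rows, width // cols
    features = []
    for r in range(rows):
        for c in range(cols):
            # Count foreground pixels inside the current mesh cell.
            total = sum(
                image[y][x]
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
            )
            features.append(total / (cell_h * cell_w))
    return features


if __name__ == "__main__":
    # Toy 4x4 binary glyph split into a 2x2 mesh.
    glyph = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mesh_features(glyph, rows=2, cols=2))
```

The resulting fixed-length vector is the kind of input an MLP classifier would take; the network itself is omitted because the abstract gives no architecture details.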
