• Title/Summary/Keyword: Text Matching


The Study on Lossy and Lossless Compression of Binary Hangul Textual Images by Pattern Matching (패턴매칭에 의한 이진 한글문서의 유.무손실 압축에 관한 연구)

  • 김영태;고형화
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.4 / pp.726-736 / 1997
  • Textual image compression by pattern matching is a coding scheme that exploits the correlations between patterns. When Hangul (Korean character) text is compressed by pattern matching, the correlations between patterns may decrease because of random contacts between phonemes. In this paper, we therefore separate connected phonemes so that the correlation between patterns can be exploited effectively by inducing matches. In the separation process, we decide whether each pattern has a vowel component, and then separate the vowels that are connected to consonants. Compared with existing algorithms, the proposed algorithm increases the compression ratio by 1.3%-3.0% over PMS[5] in lossy mode and by 3.4%-9.1% in lossless mode over SPM[7], which was submitted to the standards committee for the second-generation binary compression algorithm.

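The general idea of pattern-matching compression described above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the pixel-mismatch criterion, and the `threshold` value are assumptions, and the phoneme separation step is not modeled. Each symbol bitmap is compared with a library of earlier patterns; a close enough match is coded as a reference, otherwise the bitmap is added to the library.

```python
def mismatch_ratio(a, b):
    """Fraction of differing pixels between two binary bitmaps (lists of rows)."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 1.0  # assumption: bitmaps of different size never match
    total = len(a) * len(a[0])
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def encode_symbols(symbols, threshold=0.1):
    """Code each symbol as a library reference or as a new library pattern."""
    library = []
    codes = []
    for bitmap in symbols:
        for index, pattern in enumerate(library):
            if mismatch_ratio(bitmap, pattern) <= threshold:
                codes.append(("ref", index))  # reuse an earlier pattern
                break
        else:
            library.append(bitmap)  # no match: store the bitmap itself
            codes.append(("new", len(library) - 1))
    return library, codes
```

With `threshold=0.0` only identical bitmaps are substituted, which corresponds to a lossless setting in this sketch; a positive threshold trades fidelity for more matches.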

Interactive Typography System using Combined Corner and Contour Detection

  • Lim, Sooyeon;Kim, Sangwook
    • International Journal of Contents / v.13 no.1 / pp.68-75 / 2017
  • Interactive typography is a process in which a user communicates by interacting with text and a moving factor. This research covers interactive typography that responds in real time to a user's gestures. To make the system language-independent, a preprocessing step converts the entered text data into image data. This preprocessing is followed by recognizing the image data and setting the interaction points, using computer vision techniques such as the Harris corner detector and contour detection. User interaction is achieved using skeleton information tracked by a depth camera. By synchronizing the user's skeleton information acquired by Kinect (a depth camera) with the typography components (interaction points), all user gestures are linked with the typography in real time. An experiment was conducted in both English and Korean, in which users reported an 81% satisfaction level with an interactive typography system whose text components showed discrete movements in accordance with the users' gestures. Through this experiment, it was possible to ascertain that sensibility varied depending on the size and speed of the text and on the interactive alteration. The results show that interactive typography can potentially be an accurate communication tool, and not merely a uniform text transmission system.

RECENT RESEARCH AND DEVELOPING TREND OF ENGINEERING MANAGEMENT IN CHINA BASED ON TEXT MINING

  • Shaohua Jiang;Wenling Zhang;Zhaohong Qiu;Shaojun Wang
    • International conference on construction engineering and project management / 2009.05a / pp.814-820 / 2009
  • With the rapid development of China's economy, many engineering projects of large scale and investment have been constructed in China, some of them the biggest in the world. Along with this engineering practice, great progress has been made in engineering management research in China, and a large number of research findings are embodied in the content of research papers and represented by technical words. To understand the state of the art in the field of engineering management in China, three major parts of research papers (title, abstract, and keywords) from the last five years of three representative Chinese journals on engineering management were chosen as research materials. Unlike Western languages, Chinese has no delimiters between words, so the maximum matching and frequency statistics (MMFS) method, a text segmentation technique for Chinese text mining, is presented to extract features consisting of technical words, phrases, and words from the research materials. Recent research and developing trends of engineering management in China were found by comparing and analyzing the differences in technical words across the research materials of the last five years.

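The abstract above does not spell out the MMFS method, so the following is only a sketch of a generic forward maximum matching segmenter combined with frequency counting. The vocabulary, the `max_len` limit, and the function names are assumptions for illustration.

```python
from collections import Counter


def forward_maximum_match(text, vocabulary, max_len=4):
    """Greedy left-to-right segmentation that prefers the longest known word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a vocabulary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def term_frequencies(documents, vocabulary, max_len=4):
    """Count how often each segmented term occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(forward_maximum_match(doc, vocabulary, max_len))
    return counts
```

For example, with the hypothetical vocabulary `{"工程", "管理", "工程管理", "项目"}`, the string `"工程管理项目"` is segmented into `"工程管理"` and `"项目"`, because the longer entry wins over `"工程"`.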

Improving the Performance of Statistical Automatic Text Categorization by using Phrasal Patterns and Keyword Sets (구문 패턴과 키워드 집합을 이용한 통계적 자동 문서 분류의 성능 향상)

  • Han, Jeong-Gi;Park, Min-Gyu;Jo, Gwang-Je;Kim, Jun-Tae
    • The Transactions of the Korea Information Processing Society / v.7 no.4 / pp.1150-1159 / 2000
  • This paper presents an automatic text categorization model that improves accuracy by combining statistical and knowledge-based categorization methods. In our model, we apply the knowledge-based method first, and then apply the statistical method to the texts that are not categorized by the knowledge-based method. With this combined method, we can improve the accuracy of categorization while categorizing all the texts without failure. For statistical categorization, the vector model with Inverted Category Frequency (ICF) weighting is used. For knowledge-based categorization, Phrasal Patterns and Keyword Sets are introduced to represent sentence patterns, and then pattern matching is performed. Experimental results on new articles show that the accuracy of categorization can be improved by combining the two different categorization methods.

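The two-stage scheme in the abstract above can be sketched as below. This is an illustration under stated assumptions, not the paper's implementation: ICF is taken here as log(number of categories / number of categories containing the term), by analogy with IDF; a keyword set is assumed to fire when all of its words appear in the text; phrasal patterns are not modeled; and all function names are hypothetical.

```python
import math
from collections import Counter


def icf_weights(category_docs):
    """category_docs maps category -> list of token lists. Returns term -> ICF."""
    n_categories = len(category_docs)
    category_freq = Counter()
    for docs in category_docs.values():
        category_freq.update({t for doc in docs for t in doc})
    return {t: math.log(n_categories / cf) for t, cf in category_freq.items()}


def build_category_vectors(category_docs, icf):
    """One weighted vector per category: term frequency times ICF."""
    vectors = {}
    for category, docs in category_docs.items():
        tf = Counter(t for doc in docs for t in doc)
        vectors[category] = {t: n * icf[t] for t, n in tf.items()}
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens, vectors, icf):
    """Statistical step: pick the category vector closest to the document."""
    doc_vec = {t: n * icf.get(t, 0.0) for t, n in Counter(tokens).items()}
    return max(vectors, key=lambda c: cosine(doc_vec, vectors[c]))


def categorize(tokens, keyword_sets, vectors, icf):
    """Knowledge-based step first; fall back to the statistical step."""
    present = set(tokens)
    for category, keywords in keyword_sets:
        if keywords <= present:  # every keyword of the set occurs in the text
            return category
    return classify(tokens, vectors, icf)
```

Because the statistical step always returns some category, every text receives a label even when no keyword set matches, which mirrors the "without failure" property claimed in the abstract.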

A Study on Software Education Donation Model for the Social Care Class

  • Lee, Won Joo
    • Journal of the Korea Society of Computer and Information / v.24 no.1 / pp.239-246 / 2019
  • In this paper, we propose an effective software education donation model for the social care class. The types of software education offered to the social care class in elementary, middle, and high schools are, in order, after-school classes, club activities, creative experiences, and regular classes. For elementary school students, it is effective to begin with visual programming education based on block coding and with a curriculum that converges SW and HW, while high school students receive text programming education in languages such as Python. The software education donation activity model for the social care class can be classified into five types: geographically difficult areas, multicultural family areas, orphanages, reformatories, and basic livelihood security recipients. In addition, the survey results show that the students' interest in and satisfaction with software education are both very high, at 96%. In an effective software education donation model for the social care class, the teaching staff consists of responsible professors, lecturers, and assistant instructors. Software training for the social care class is effective when run on a year-by-year basis, so that students can feel authenticity and trust. Software education contents focus on visual programming and physical computing education in elementary or middle school, and on text programming and physical computing education in high school. It is necessary to construct a software education donor matching system that supports the management of software education donations by efficiently matching schools (consumers: elementary, middle, and high schools) with software education donors (suppliers).

Implementation of an efficient Pocket PC-based Hangul Matching System (Pocket PC기반의 효율적인 한글 정합 시스템 구현)

  • Park Jong-Min;Cho Beom-Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.7 / pp.1546-1552 / 2004
  • Electronic ink is data stored in the form of handwritten text or script, without conversion into ASCII by handwriting recognition, on pen-based computers and Personal Digital Assistants (Pocket PC) to support natural and convenient data input. One of the most important issues is how to search electronic ink in order to use it. We propose and implement a script matching algorithm for electronic ink. The proposed matching algorithm separates the input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of the strokes in the database using dynamic programming.
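The abstract above does not specify the dynamic programming recurrence, so the sketch below uses dynamic time warping (DTW) with a Euclidean local cost as one plausible choice. The feature layout (a sequence of numeric tuples, one per primitive stroke) and the function names are assumptions.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic-programming alignment cost between two feature sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i items of seq_a
    # with the first j items of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # skip ahead in seq_a
                cost[i][j - 1],      # skip ahead in seq_b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def best_match(query, database):
    """Return the key of the stored script closest to the query."""
    return min(database, key=lambda name: dtw_distance(query, database[name]))
```

Warping lets two scripts match even when one of them was split into more primitive strokes than the other, which is why an alignment-based distance is a natural fit for handwritten input.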

HearCAM Embedded Platform Design (히어 캠 임베디드 플랫폼 설계)

  • Hong, Seon Hack;Cho, Kyung Soon
    • Journal of Korea Society of Digital Industry and Information Management / v.10 no.4 / pp.79-87 / 2014
  • In this paper, we implemented the HearCAM platform on the Raspberry PI B+ model, which is an open source platform. The Raspberry PI B+ model consists of a dual step-down (buck) power supply with a polarity protection circuit and hot-swap protection, a Broadcom SoC BCM2835 running at 700MHz, 512MB of RAM soldered on top of the Broadcom chip, and a PI camera serial connector. We used the Google speech recognition engine to recognize voice characteristics, implemented pattern matching with OpenCV software, and extended the speech functionality with SVOX TTS (text-to-speech) so that the matching result is spoken to users. We thus implemented the HearCAM functions for identifying the voice and the pattern characteristics of a target image scanned with the PI camera, together with gathering temperature sensor data in an IoT environment. We implemented speech recognition, pattern matching, and temperature sensor data logging over Wi-Fi wireless communication. We then designed and produced the HearCAM enclosure ourselves using 3D printing technology.

Implementation of JBIG2 CODEC with Effective Document Segmentation (문서의 효율적 영역 분할과 JBIG2 CODEC의 구현)

  • 백옥규;김현민;고형화
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.6A / pp.575-583 / 2002
  • JBIG2 is an International Standard for compression of bi-level images and documents. JBIG2 supports three encoding modes for high compression according to the region features of documents. One of these is generic region coding for bitmap coding, in which the basic bitmap coder is either MMR or arithmetic coding. Pattern matching coding is used for text regions, and halftone pattern coding is used for halftone regions. In this paper, a document is segmented into line-art, halftone, and text regions for JBIG2 encoding, and a JBIG2 CODEC is implemented. For efficient region segmentation of documents, a region segmentation method using wavelet coefficients is applied together with an existing boundary extraction technique. For the facsimile test image (IEEE-167a), the compression ratio improves by about 2% and the subjective quality is enhanced. We also propose arbitrary-shape halftone region coding, which improves the subjective quality of the text neighboring a halftone region.

Effective Cross-Lingual Text Retrieval using a Fuzzy Knowledge Base (퍼지 지식베이스를 이용한 효과적인 다언어 문서 검색)

  • Choi, Myeong-Bok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.1 / pp.53-62 / 2008
  • Cross-lingual text retrieval (CLTR) is information retrieval in which a user searches a set of documents written in one language with a query in another language. This thesis proposes a CLTR system based on a fuzzy multilingual thesaurus to handle partial matching between terms of two different languages. The proposed CLTR system uses a fuzzy term matrix, defined in this thesis, to perform information retrieval effectively. In the fuzzy term matrix, all relation degrees between terms are inferred with the transitive closure algorithm so that all implicit links between terms are reflected in the retrieval process. With this framework, the proposed CLTR system enhances retrieval effectiveness because it can closely emulate a human expert's decision making in CLTR.

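How a transitive closure can infer implicit links in a fuzzy term matrix is sketched below. The abstract does not state which composition the thesis uses, so max-min composition, a common choice for fuzzy relations, is assumed here; the example terms and relation degrees are invented for illustration.

```python
def max_min_transitive_closure(matrix):
    """Max-min transitive closure of a square fuzzy relation matrix."""
    n = len(matrix)
    closure = [row[:] for row in matrix]
    # Warshall-style sweep: allow term k as an intermediate link. The strength
    # of a chained link is the weakest degree along it (min); the best of all
    # chains is kept (max).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(closure[i][k], closure[k][j])
                if via_k > closure[i][j]:
                    closure[i][j] = via_k
    return closure


# Hypothetical terms: 0 = "computer" (English), 1 = "컴퓨터" (Korean), 2 = "PC"
relation = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.7],
    [0.0, 0.7, 1.0],
]
closure = max_min_transitive_closure(relation)
```

In this example "computer" and "PC" have no direct link, but the closure assigns them degree 0.7 through the Korean term, so a query in one language can partially match documents indexed under the other.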