• Title/Summary/Keyword: sentence symbol processing

Korean Sentence Symbol Preprocess System for the Improvement of Speech Synthesis Quality (음성 합성 시스템의 품질 향상을 위한 한국어 문장 기호 전처리 시스템)

  • Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.149-156 / 2015
  • In this paper, we propose a Korean sentence symbol preprocessor for a speech synthesis system that supports SSML (Speech Synthesis Markup Language), with the aim of improving the quality of the synthesized speech. Based on an analysis of Korean Wikipedia documents, we propose 8 categories for the meaning of sentence symbols and 11 regular expressions for classifying symbols into those categories. The resulting Korean sentence symbol preprocessing system achieved 56% precision and 71.45% recall on 63,000 sentences.
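
A minimal sketch of regex-based symbol classification, in the spirit of the approach the abstract describes. The abstract does not list the paper's 8 categories or 11 regular expressions, so the category names (`range`, `ratio`, `aside`), the patterns, and the function `classify_symbols` below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical categories and patterns, for illustration only; the paper's
# actual 8 categories and 11 regular expressions are not given in the abstract.
RULES = [
    ("range", re.compile(r"(?<=\d)\s*[~\-]\s*(?=\d)")),  # e.g. 1990~1995
    ("ratio", re.compile(r"(?<=\d):(?=\d)")),             # e.g. 3:2
    ("aside", re.compile(r"\([^()]*\)")),                 # parenthetical text
]


def classify_symbols(sentence):
    """Return (category, matched_text, start_offset) for every rule match."""
    found = []
    for category, pattern in RULES:
        for match in pattern.finditer(sentence):
            found.append((category, match.group(0), match.start()))
    # Report matches in the order they appear in the sentence.
    return sorted(found, key=lambda item: item[2])
```

A preprocessor along these lines could then decide, per category, how the symbol should be read aloud or marked up before the text reaches the synthesizer; that mapping is not specified here.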

A Study on the Computer-Aided Processing of Sentence-Logic Rule (문장논리규칙의 컴퓨터프로세싱을 위한 연구)

  • Kum, Kyo-young;Kim, Jeong-mi
    • Journal of Korean Philosophical Society / v.139 / pp.1-21 / 2016
  • To grasp the consistency and the truth or falsity of sentence descriptions quickly and accurately, we may need the help of a computer, so it is necessary to study computer processing techniques for this purpose. This requires research and planning for the whole study, namely a plan for the necessary tables and their processing, and development of the table of the five logic rules. In future research, it will be necessary to create and develop the tables of the ten basic inference rules and the eleven kinds of derived inference rules, to build a DB of those tables, and to implement computer processing of sentence logic on that foundation using JSP for server programming and Java for client programming. In this paper we present the overall research plan: referring to the logic operation table, dividing the rules into logic rules and inference rules, and preparing the listed processes sequentially according to the combinations in which they are used. These jobs are presented as a variable table and a symbol table; subsequent studies will add a processing table, use JSP server programming and Java client programming to construct a DB activated on subject and predicate parts, and prove the truth or falsity of a sentence. Taking the tables prepared in chapter 2 as a guide, chapter 3 shows the creation and development of the table of the five logic rules, i.e., the Rule of Double Negation, De Morgan's Rule, the Commutative Rule, the Associative Rule, and the Distributive Rule. These five logic rules are used in Propositional Calculus, Sentential Logic Calculus, and Statement Logic Calculus for sentence logic.
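
The five logic rules named in the abstract can be checked mechanically with truth tables. The sketch below is not the paper's table design or its JSP/Java implementation; it is a small Python illustration that shows one common form of each rule, and the names `RULES`, `holds`, and `check_all` are assumptions made for this example.

```python
from itertools import product

# One common form of each rule as a (left-hand side, right-hand side) pair.
# Every function takes three truth values so all rules share one signature.
RULES = {
    "double negation": (lambda p, q, r: not (not p),
                        lambda p, q, r: p),
    "De Morgan": (lambda p, q, r: not (p and q),
                  lambda p, q, r: (not p) or (not q)),
    "commutative": (lambda p, q, r: p and q,
                    lambda p, q, r: q and p),
    "associative": (lambda p, q, r: (p or q) or r,
                    lambda p, q, r: p or (q or r)),
    "distributive": (lambda p, q, r: p and (q or r),
                     lambda p, q, r: (p and q) or (p and r)),
}


def holds(lhs, rhs):
    """True if lhs and rhs agree on every row of the 8-row truth table."""
    return all(lhs(*row) == rhs(*row)
               for row in product([False, True], repeat=3))


def check_all():
    """Map each rule name to whether its two sides are equivalent."""
    return {name: holds(lhs, rhs) for name, (lhs, rhs) in RULES.items()}
```

Calling `check_all()` reports each rule as equivalent, while a non-rule such as comparing `p or q` with `p and q` fails the same check.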

Analysis of Korean Language Parsing System and Speed Improvement of Machine Learning using Feature Module (한국어 의존 관계 분석과 자질 집합 분할을 이용한 기계학습의 성능 개선)

  • Kim, Seong-Jin;Ock, Cheol-Young
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.8 / pp.66-74 / 2014
  • Recently, many software engineers and linguists have been studying Korean parsing systems. Such systems mainly use either machine learning or the symbol processing paradigm. However, parsing systems based on machine learning take a long time to train because the Korean sentence data is very large, and they show a limited recognition rate because the data itself contains errors. In this paper, we design a system using feature modules that reduces training time, and we analyze the recognition rate as a function of the number of training sentences and the number of training iterations. The designed system uses separated modules and a sorted table for binary search. We use 36,090 refined sentences extracted from the Sejong Corpus. Training time decreased by about three hours, and the recognition rate was highest, at 84.54%, when 10,000 sentences were trained 50 times. When all 32,481 training sentences were trained 10 times, the recognition rate was 82.99%. We conclude that it is more efficient to use the refined data and to repeat training until the system reaches a steady state.
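
As a rough illustration of the "sorted table for binary search" idea, the sketch below stores feature weights in sorted order and looks them up with Python's `bisect` module. The abstract does not describe the system's actual data layout, so the feature strings, the functions `build_table` and `lookup`, and the default weight are hypothetical.

```python
from bisect import bisect_left


def build_table(weights):
    """Sort (feature, weight) pairs once so lookups can use binary search."""
    items = sorted(weights.items())
    keys = [key for key, _ in items]
    values = [value for _, value in items]
    return keys, values


def lookup(table, feature, default=0.0):
    """Binary-search the sorted keys; return `default` for unseen features."""
    keys, values = table
    index = bisect_left(keys, feature)
    if index < len(keys) and keys[index] == feature:
        return values[index]
    return default


# Hypothetical feature strings, for illustration only.
table = build_table({"pos=NNG|head=VV": 0.8, "dist=1": -0.2})
```

Each lookup costs O(log n) comparisons on the sorted keys, which is one plausible reason a sorted table would help when the same features are queried many times during training.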