• Title/Summary/Keyword: Part-Of-Speech Determination

Probabilistic Part-Of-Speech Determination for Efficient English-Korean Machine Translation (효율적 영한기계번역을 위한 확률적 품사결정)

  • Kim, Sung-Dong;Kim, Il-Min
    • The KIPS Transactions:PartB / v.17B no.6 / pp.459-466 / 2010
  • Natural language processing faces several ambiguity problems, and in English-Korean machine translation these problems must be resolved at each translation step. This paper focuses on resolving the part-of-speech ambiguity of English words to improve the efficiency of English analysis, as part of an effort to develop a practical English-Korean machine translation system. To be integrated into a machine translation system, part-of-speech determination must be both fast and accurate. This paper proposes probabilistic models for part-of-speech determination, built from the Penn Treebank corpus. In the experiments, we report the performance of the part-of-speech determination models and the efficiency improvement that the proposed method brings to the machine translation system.

A Model of English Part-Of-Speech Determination for English-Korean Machine Translation (영한 기계번역에서의 영어 품사결정 모델)

  • Kim, Sung-Dong;Park, Sung-Hoon
    • Journal of Intelligence and Information Systems / v.15 no.3 / pp.53-65 / 2009
  • Part-of-speech determination is necessary for resolving part-of-speech ambiguity in English-Korean machine translation. Part-of-speech ambiguity causes high parsing complexity and makes accurate translation difficult. To solve this problem, part-of-speech ambiguity must be resolved after lexical analysis and before parsing. This paper proposes the CatAmRes model, which resolves part-of-speech ambiguity, and compares its performance with that of other part-of-speech tagging methods. The CatAmRes model determines the part of speech using a probability distribution obtained from Bayesian network training together with statistical information, both based on the Penn Treebank corpus. The model consists of a Calculator and a POSDeterminer. The Calculator computes the degree of appropriateness of a part of speech, and the POSDeterminer determines the part of speech of a word based on the calculated values. In the experiments, we measure performance using sentences from the WSJ, Brown, and IBM corpora.
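
The two-stage split described in this abstract (a scoring component followed by a deciding component) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the model's internals, so the sketch swaps in plain relative-frequency estimates for the Bayesian network, and the toy corpus, the `score` formula, the add-one smoothing, and the `fallback` tag are all assumptions made for illustration. Only the names `Calculator` and `POSDeterminer` come from the abstract.

```python
from collections import Counter, defaultdict

# Toy tagged sentences (hypothetical data, not taken from the Penn Treebank).
TAGGED = [
    [("the", "DT"), ("book", "NN"), ("is", "VBZ"), ("here", "RB")],
    [("they", "PRP"), ("book", "VBP"), ("flights", "NNS")],
    [("a", "DT"), ("book", "NN"), ("fell", "VBD")],
    [("we", "PRP"), ("book", "VBP"), ("rooms", "NNS")],
    [("the", "DT"), ("flights", "NNS"), ("left", "VBD")],
]


class Calculator:
    """Scores how appropriate a tag is for a word (assumed formula)."""

    def __init__(self, tagged_sentences):
        self.word_tag = defaultdict(Counter)  # word -> counts of its tags
        self.prev_tag = defaultdict(Counter)  # previous tag -> counts of next tag
        for sentence in tagged_sentences:
            prev = "<s>"  # sentence-start marker
            for word, tag in sentence:
                self.word_tag[word][tag] += 1
                self.prev_tag[prev][tag] += 1
                prev = tag

    def candidates(self, word):
        """Tags observed for this word; empty list if the word is unknown."""
        return list(self.word_tag.get(word, {}))

    def score(self, word, tag, prev):
        """Lexical relative frequency times a smoothed context frequency."""
        lexical = self.word_tag[word][tag] / sum(self.word_tag[word].values())
        context_counts = self.prev_tag[prev]
        total = sum(context_counts.values())
        n_candidates = len(self.candidates(word))
        # Add-one smoothing over the word's candidate tags (an assumption).
        context = (context_counts[tag] + 1) / (total + n_candidates)
        return lexical * context


class POSDeterminer:
    """Picks the highest-scoring candidate tag for each word, left to right."""

    def __init__(self, calculator, fallback="NN"):
        self.calculator = calculator
        self.fallback = fallback  # tag for unseen words (an assumption)

    def determine(self, words):
        tags = []
        prev = "<s>"
        for word in words:
            cands = self.calculator.candidates(word)
            if cands:
                tag = max(cands, key=lambda t: self.calculator.score(word, t, prev))
            else:
                tag = self.fallback
            tags.append(tag)
            prev = tag
        return tags


determiner = POSDeterminer(Calculator(TAGGED))
print(determiner.determine(["the", "book", "fell"]))
print(determiner.determine(["they", "book", "rooms"]))
```

In this toy setup the ambiguous word "book" is resolved by the preceding tag: it comes out as a noun after a determiner and as a verb after a pronoun. Keeping scoring and selection in separate classes mirrors the component split named in the abstract and lets either part be replaced independently.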

An Acoustical Study of Korean Diphthongs (한국어 이중모음의 음향학적 연구)

  • Yang Byeong-Gon
    • MALSORI / no.25_26 / pp.3-26 / 1993
  • The goals of the present study were (1) to collect and analyze sets of fundamental frequency (F0) and formant frequency (F1, F2, F3) data of Korean diphthongs from ten linguistically homogeneous male speakers of Korean, and (2) to make a comparative study of Korean monophthongs and diphthongs. Various definitions, kinds, and previous studies of diphthongs were examined in the introduction. Procedures for screening subjects to form a linguistically homogeneous group, time point selection, and formant determination were explained in the following section. The principal findings were as follows: 1. Much variation was observed in the ongliding part of diphthongs. 2. F2 values of the [j] group descended while those of the [w] group ascended. 3. The average duration of diphthongs was about 110 msec, and there was not much variation between speakers and diphthongs. 4. In a comparative study of monophthongs and diphthongs, F1 and F2 values of the same offgliding part at the third time point almost converged. 5. The gliding of diphthongs was very short beginning from the h-noise. Perceptual studies using speech synthesis are desirable to find major parameters for diphthongs. The results of the present study will be useful in the area of automated speech recognition and computer synthesis of speech.
