• Title/Summary/Keyword: syntactic category


Syntactic Category Prediction for Improving Parsing Accuracy in English-Korean Machine Translation (영한 기계번역에서 구문 분석 정확성 향상을 위한 구문 범주 예측)

  • Kim Sung-Dong
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.345-352 / 2006
  • A practical English-Korean machine translation system should be able to translate long sentences quickly and accurately. The intra-sentence segmentation method has been proposed and has contributed to speeding up syntactic analysis. This paper proposes a syntactic category prediction method using decision trees to obtain accurate parsing results. In parsing with segmentation, each segment is parsed separately and the results are combined to generate the sentence structure. Syntactic category prediction helps select more accurate analysis structures after partial parsing, and thus improves parsing accuracy. We construct features for predicting syntactic categories from the parsed Wall Street Journal corpus and generate decision trees. In the experiments, we compare performance against predictions by human-built rules, trigram probabilities, and neural networks. We also show how much category prediction contributes to improving translation quality.

Combinatory Categorial Grammar for Korean

  • Han, Sung-Kook;Park, Chan-Gon
    • Annual Conference on Human and Language Technology / 1990.11a / pp.164-171 / 1990
  • A commutative productive category is proposed as an addition to the current CCG for the syntactic analysis of free word order languages such as Korean. Introducing this sort of category is quite natural for the categorial lexicon and functional operations. We present the theoretical basis of the productive category and examine its linguistic applicability through typical syntactic structures of Korean.


Korean Question-Answering System using Syntactic-Relation Information (구문 관계 정보를 이용한 한국어 질의-응답 시스템)

  • 신승은;이대연;서영훈
    • The Journal of the Korea Contents Association / v.4 no.2 / pp.36-42 / 2004
  • This paper describes a Korean question-answering system that uses the syntactic-relation information of verbs to overcome the lack of reliable knowledge and linguistic resources. The syntactic-relation information consists of the original form of a verb, its usual usage pattern, the semantic category of each dependent noun, synonym verbs, and passive verbs. We use the syntactic-relation information to parse sentences or phrases with the usual usage pattern of the verb and the semantic conditions of the components dependent on the verb. We also use that information to parse answer candidate sentences and to find an answer from the questioned case slot. Our experiments show that using the syntactic-relation information of verbs to overcome the lack of reliable knowledge and linguistic resources is effective for Korean question answering.


A Constraint-based Approach to English Gerunds

  • Kim, Yong-Beom
    • Language and Information / v.7 no.2 / pp.117-137 / 2003
  • This paper attempts to provide an alternative analysis of the categorial issues related to English gerunds. In particular, it rejects Malouf's approach, which creates a new syntactic category, gerund, by mixing nominal and verbal categories. This paper identifies two syntactic structures in English gerunds: nominal gerunds and verbal gerunds. This distinction is based on the syntactic and semantic characteristics of each type and is intended to account for the external distribution and endocentricity of the construction. Treating verbal gerunds syntactically as verbal categories, this paper proposes that English verbal gerunds act like other verbal categories such as infinitives, whereas nominal gerunds behave much like derived nominals. This paper proposes a few lexical rules that can handle the two types of gerunds. The proposal can be extended to prepositional complements as well as sentential subject positions. It not only resolves the issues involving the distributional properties of the gerund construction but also captures the syntactic parallelism observable between gerunds and other verbal constructions in English.


The Text Analysis of Plasticity Expressed in the Modern Art to Wear (Part I) - Focused on the West Art Works since 1980s - (현대 예술의상에 표현된 조형성의 텍스트 분석 (제1보) - 1980년대 이후 서구작가 작품을 중심으로 -)

  • Seo Seung Mi;Yang Sook Hi
    • Journal of the Korean Society of Clothing and Textiles / v.29 no.6 / pp.793-804 / 2005
  • The new paradigm of the 21st century demands an openly different world of formative ideologies with respect to art and design. The purpose of this study is to comprehend the aesthetic essence of clothing as an art, by investigating the artistic theories put forward by philosophers of art. Art to Wear was categorized by style in order to understand its artistic meaning and to analyze its character. On the foundation of semiotic theory, the features of Art to Wear and its analysis categories were discussed in the context of Charles Morris's three dimensions of semiotic analysis. The conclusions of the research are as follows. From a semiotic perspective, the features and analysis categories of Art to Wear were divided into the syntactic dimension, the semantic dimension, and the pragmatic dimension. The analytical categories of the syntactic dimension were topology, shape, and color. The semantic dimension of Art to Wear was divided into the categories of denotation and connotation. In addition, the pragmatic dimension of Art to Wear was divided into a delivering function and a common function.

Detection of Protein Subcellular Localization based on Syntactic Dependency Paths (구문 의존 경로에 기반한 단백질의 세포 내 위치 인식)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.15B no.4 / pp.375-382 / 2008
  • A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and WordNet thesaurus. In the first step, we constructed syntactic dependency paths from each protein to its location candidate, and then converted the syntactic dependency paths into dependency trees. In the second step, we retrieved root information of the syntactic dependency trees. In the final step, we extracted syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extracted syntactic category and syntactic direction as syntactic information, and synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees from the training data, we extracted (protein, localization) pairs from the test sentences. Even with no biomolecular knowledge, our method showed reasonable performance in experimental results using Medline abstract data. Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods by 12-25%.

The Role of H Tone of an AP in Korean: The Relation Between Prosody and Morphology

  • Kang, Hyun-Sook
    • Speech Sciences / v.15 no.1 / pp.7-23 / 2008
  • This paper investigates the tonal patterns of two prosodic constituents in Korean, the AP and the PWD, and their relation to morphological/syntactic structure. Specifically, this paper asks the following questions. First, if there is more than one PWD in an AP, how is each PWD specified in terms of tones? Second, when there is only one PWD in an AP and it consists of several morphemes, is there any preference in the association between tones and the morphemes that constitute that PWD? Third, if an AP dominates a PWD and a PWD contains at least one morpheme of a lexical category, it follows that an AP should contain at least one morpheme of a lexical category. Can this be verified with experimental data? To answer these questions, Experiments I and II were conducted with target material consisting of a stem and suffixes that varied in length. The results of this preliminary test show that as the number of syllables in the target material increases, more AP tonal patterns occur in it, and as a result, in some cases, an AP consisting of suffixes only may occur.


A Structure of Passive Constructions in Korean and their meaning 'Potential' (한국어 피동문의 구조와 가능(potential)의 의미 해석 -대조적 관점에서-)

  • Mok, Jung-Soo;Kim, Yeong-Jung
    • Lingua Humanitatis / v.8 / pp.369-387 / 2006
  • Which syntactic function should we assign to the 'ga-type' constituent that occurs in morphological passive constructions in Korean, [N0-neun N1-i Vpass-ending]? This problem is important in two respects. First, a small change in the status of the particle '-i/ga' can exert an overall influence on Korean grammar. Second, the particle '-i/ga' cannot guarantee that 'ga-type' constituents are the subject of the sentence, so the concept of syntactic category should be distinguished from that of syntactic function. This paper claims that sentence analysis has long focused on the structure of the proposition, namely the argument structure, and that the direction of analysis should turn to the 'person structure', which is revealed at the pragmatic level. On this basis, this paper suggests that the specific type of morphological passive construction in Korean, [N0-neun N1-i Vpass-ending], should be analysed in line with psych-verb constructions, and that the modal meaning 'potential' of the passive constructions is correlated with sentence pattern and 'person structure'.


Is Category P Lexical or Functional?: A Generalized pP-Shell Approach

  • Hong, Sung-Shim;Yang, Xiaodong
    • Language and Information / v.14 no.2 / pp.71-84 / 2010
  • The aim of this paper is to propose that the category P is encapsulated within a functional layer above the lexical layer, just like vP containing a lexical VP. As is well known, the category P has long been in the obscure domain of syntactic studies: Marantz (2001) and den Dikken (2003), for example, argue that P is a lexical category, but Emonds (1985), Grimshaw (1991), and Baker (2003) maintain that the category P is functional and is a closed category without its own intrinsic meaning. On the other hand, Zwart (2005) argues that it does have some meaning. Following the works of Svenonius (2003, 2006, 2007) and the spirit of Rizzi's (1997) split CP hypothesis, we elaborate and develop Svenonius' idea of a split-pP analysis, with detailed schematic representations of novel examples in English, Korean, and Chinese. Unlike Svenonius, however, this paper incorporates KP into the pP-Shell, which is a substantial simplification. Furthermore, Chinese localizers, which have long been considered postpositions, are now placed under the category of prepositions. This proposal gives X-bar theoretic consistency to the categorial status of Chinese phrasal structures. In short, the present analysis accounts for the inconsistency found in English complex preposition phrases (Quirk et al. 1972, 1985), Chinese circumposition phrases (Ernst 1988, Liu 2002), and Korean postposition phrases in a unified and consistent manner. Furthermore, by proposing a finer-grained phrasal architecture for the category P, this analysis resolves the controversial status of the category.


Korean Syntactic Analysis by Using Clausal Segmentation of Embedded Clause (내포문의 단문 분할을 이용한 한국어 구문 분석)

  • Lee, Hyeon-Yeong;Lee, Yong-Seok
    • Journal of KIISE:Software and Applications / v.35 no.1 / pp.50-58 / 2008
  • Most Korean sentences are complex sentences consisting of a main clause and an embedded clause. These complex sentences have more than one predicate, which causes various syntactic ambiguities in syntactic analysis. The ambiguities arise from phrase attachment problems caused by the modifying scope of the embedded clause. To resolve them, we determine the scope of the embedded clause in the sentence and treat this clause as a unit of syntactic category. In this paper, we use sentence pattern information (SPI) and syntactic properties of Korean to determine the scope of an embedded clause. First, we split the complex sentence into an embedded clause and a main clause, under the condition that the embedded clause must have maximal arguments. This is done using the SPI of the predicate in the embedded clause. Then, the embedded clause is converted into a noun phrase or adverbial phrase in the main clause according to the properties of Korean syntax. By this method, the structure of the complex sentence is converted into that of a single clause, and the phrase attachment problems caused mainly by modifying scope are resolved easily. We call this method clausal segmentation of embedded clauses. In an empirical evaluation of parsing 1000 sentences, we found that our method reduces syntactic ambiguities by 88.32% compared to a method that does not use SPI and splits the sentence into basic clauses.