• Title/Summary/Keyword: Compound Noun Phrase

Chunking of Contiguous Nouns using Noun Semantic Classes (명사 의미 부류를 이용한 연속된 명사열의 구묶음)

  • Ahn, Kwang-Mo;Seo, Young-Hoon
    • The Journal of the Korea Contents Association / v.10 no.3 / pp.10-20 / 2010
  • This paper presents a strategy for chunking contiguous noun sequences using semantic classes. We call a sequence of contiguous nouns that can be treated as a single noun a compound noun phrase. To chunk compound noun phrases, we use noun pairs extracted from a syntactically tagged corpus together with their semantic class pairs. For reliability, these noun pairs and semantic classes are built from the syntactically tagged corpus and the detailed dictionary of the Sejong corpus. Compound noun phrases of arbitrary length can also be chunked with this information. Chunking uses 38,940 pairs of 'left noun - right noun', 65,629 pairs of 'left noun - semantic class of right noun', 46,094 pairs of 'semantic class of left noun - right noun', and 45,243 pairs of 'semantic class of left noun - semantic class of right noun'. The test data are 1,000 sentences not used in training, each containing contiguous nouns of length greater than 2, randomly selected from the Sejong morphologically tagged corpus. The experiments yield 86.89% precision, 80.48% recall, and an F-measure of 83.56%.
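
The four pair tables described in this abstract can be pictured with a small lookup sketch. It is only an illustration under stated assumptions: the toy tables, the semantic class names, the rule that any matching table licenses a link, and the greedy left-to-right grouping are all hypothetical and are not taken from the paper. The last function simply checks that the reported F-measure is the harmonic mean of the reported precision and recall.

```python
# Hypothetical toy tables; the paper builds its tables from the Sejong corpus.
SEM_CLASS = {"data": "DATA", "information": "DATA",
             "retrieval": "ACTIVITY", "analysis": "ACTIVITY"}
NOUN_NOUN = {("information", "retrieval")}    # left noun - right noun
NOUN_CLASS = {("information", "ACTIVITY")}    # left noun - class of right noun
CLASS_NOUN = {("DATA", "retrieval")}          # class of left noun - right noun
CLASS_CLASS = {("DATA", "ACTIVITY")}          # class of left - class of right


def can_link(left, right):
    """Return True if any of the four tables licenses linking the two nouns."""
    lc, rc = SEM_CLASS.get(left), SEM_CLASS.get(right)
    return ((left, right) in NOUN_NOUN
            or (left, rc) in NOUN_CLASS
            or (lc, right) in CLASS_NOUN
            or (lc, rc) in CLASS_CLASS)


def chunk(nouns):
    """Greedily group adjacent nouns into chunks (an assumed strategy)."""
    chunks = []
    for noun in nouns:
        if chunks and can_link(chunks[-1][-1], noun):
            chunks[-1].append(noun)
        else:
            chunks.append([noun])
    return chunks


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

Here `chunk(["data", "analysis", "school"])` groups the first two nouns through the class-class table alone, which shows how semantic classes let unseen noun pairs be chunked.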

Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method

  • Ahn, Hee-Jeong;Kim, Kee-Won;Kim, Seung-Hoon
    • Journal of the Korea Society of Computer and Information / v.22 no.3 / pp.107-113 / 2017
  • Most online bookstores provide users with bibliographic information about a book rather than concrete information such as its thematic words and atmosphere. Thematic words in particular help users understand books and cast a wide net. In this paper, we propose an efficient method for extracting thematic words from book text by applying a compound noun and noun phrase synthesis method. Compound nouns represent the characteristics of a book in more detail than single nouns. The proposed method extracts thematic words from book text by recognizing two types of noun phrases: single nouns and compound nouns formed by combining single nouns. The recognized single nouns, compound nouns, and noun phrases are weighted by TF-IDF and extracted as main words. In addition, this paper suggests calculating the frequencies of the subject, object, and other roles separately in the TF-IDF calculation, rather than simply summing the frequencies of all nouns. Experiments are carried out in the field of economic management, and the extracted thematic words are verified through a survey and book search. Nine of the 10 experimental results in this study indicate that the thematic words extracted by the proposed method are more effective for understanding the content. It is also confirmed that the thematic words extracted by the proposed method give better book search results.
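
One way to picture the role-separated frequency idea is the sketch below. It is a minimal illustration, not the authors' formula: the role weights, the function names, and the plain `log(N / df)` form of IDF are assumptions made here for the example.

```python
import math
from collections import Counter

# Hypothetical role weights; the abstract does not state the values used.
ROLE_WEIGHTS = {"subject": 2.0, "object": 1.5, "other": 1.0}


def role_weighted_tf(occurrences):
    """Sum role weights per term; occurrences is a list of (term, role)."""
    tf = Counter()
    for term, role in occurrences:
        tf[term] += ROLE_WEIGHTS.get(role, ROLE_WEIGHTS["other"])
    return tf


def tf_idf(docs):
    """Return one {term: score} dict per document, with idf = log(N / df)."""
    n_docs = len(docs)
    tfs = [role_weighted_tf(doc) for doc in docs]
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tf in tfs for term in tf)
    return [{term: weight * math.log(n_docs / df[term])
             for term, weight in tf.items()}
            for tf in tfs]
```

With two toy documents, a term that occurs in every document scores zero, while a compound noun that occurs as a subject in only one document is ranked highest there.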

Integrated Indexing Method using Compound Noun Segmentation and Noun Phrase Synthesis (복합명사 분할과 명사구 합성을 이용한 통합 색인 기법)

  • Won, Hyung-Suk;Park, Mi-Hwa;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.27 no.1 / pp.84-95 / 2000
  • In this paper, we propose an integrated indexing method that combines compound noun segmentation and noun phrase synthesis. Statistical information is used for compound noun segmentation, and natural language processing techniques are used for noun phrase synthesis. First, we choose index terms from simple words using the results of morphological analysis and part-of-speech tagging. Second, noun phrases are automatically synthesized from the results of syntactic analysis. If syntactic analysis fails, only the morphological analysis and tagging results are applied. Third, we select compound nouns from the tagging results and then segment and re-synthesize them using statistical information. In this way, segmented and synthesized terms are used together as index terms to supplement the single terms. We demonstrate the effectiveness of the proposed integrated indexing method for Korean compound noun processing using KTSET2.0 and KRIST SET, which are standard test collections for Korean information retrieval.
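
The segmentation step can be sketched as a search for the most probable split of an unspaced compound noun. This is an illustrative stand-in, not the paper's statistical model: the noun counts are invented, and scoring a split by the sum of log relative frequencies is an assumption.

```python
import math
from functools import lru_cache

# Hypothetical noun counts standing in for corpus statistics.
NOUN_COUNTS = {"정보": 50, "검색": 40, "시스템": 30, "정보검색": 5}
TOTAL = sum(NOUN_COUNTS.values())


def segment(compound):
    """Split a compound noun into known nouns, maximizing summed log frequency."""
    @lru_cache(maxsize=None)
    def best(start):
        if start == len(compound):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, len(compound) + 1):
            piece = compound[start:end]
            if piece in NOUN_COUNTS:
                score, rest = best(end)
                if rest is not None:
                    log_p = math.log(NOUN_COUNTS[piece] / TOTAL)
                    candidates.append((score + log_p, (piece,) + rest))
        # No known noun covers the remainder: signal failure with None.
        return max(candidates) if candidates else (float("-inf"), None)

    pieces = best(0)[1]
    return list(pieces) if pieces else [compound]


def index_terms(compound):
    """Index the segmented pieces together with the whole compound."""
    pieces = segment(compound)
    return pieces + ([compound] if len(pieces) > 1 else [])
```

Keeping the whole compound next to its pieces mirrors the abstract's point that segmented and synthesized terms are used together as index terms.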

Intonational Realization and Perception of English Noun Phrases and Compound Nouns (영어 명사구와 복합명사의 억양 실현 양상과 지각)

  • Kang, Sun-Mi;Kim, Mi-Hye;Jeon, Yoon-Shil;Kim, Kee-Ho
    • Speech Sciences / v.12 no.4 / pp.153-166 / 2005
  • This paper examines the accent implementation and perception of noun phrases and compound nouns in English sentences, arguing that the primary stress of a noun phrase or compound noun is realized as relative prominence in intonation. The production test examines how the stress patterns of noun phrases and compound nouns are realized in the intonation of native English speakers' utterances. The perception test investigates English and Korean listeners' comprehension of the intonation of noun phrases and compound nouns. The results of this experimental study show that speakers and listeners produce and perceive primary stress as a relatively prominent accent, although, in contrast to English listeners, Korean learners have difficulty using the location of the pitch accent as a cue and distinguishing compound nouns from noun phrases.

A Method Of Compound Noun Phrase Indexing for Resolving Syntactic Diversity (구문 다양성 해소를 위한 복합명사구 색인 방법)

  • Cho, Min-Hee;Jeong, Do-Heon
    • The Journal of the Korea Contents Association / v.11 no.3 / pp.467-476 / 2011
  • The compound noun phrase (CNP) is an important factor in semantic information processing because the meaning of a CNP is less ambiguous than that of a single word. However, a CNP can be expressed in various forms even when it expresses the same meaning. This is called syntactic diversity, and it makes it difficult for an information system to recognize that two expressions have the same sense. To resolve syntactic diversity, we propose an indexing method for compound noun phrases. The main purpose is to produce an identical index term for the various forms of CNPs that have the same meaning. The research proceeds in the following steps. First, we build rule templates and use them to extract CNPs from a set of domestic research papers. In general, a CNP has a unique meaning. Taking this characteristic into account, we propose synthesis rules for index terms and apply them to the CNPs extracted in the previous step. For an objective performance evaluation, the HANTEC 2.0 test set was used and the results were compared with a baseline model. The experiment and evaluation confirm that the proposed indexing method can improve retrieval precision and the overall performance of information retrieval.
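
The idea of mapping different surface forms of a CNP to one index term can be sketched as follows. This is a hypothetical simplification, not the paper's rule templates or synthesis rules: it assumes the input is already morphologically analyzed with Sejong-style tags (`NNG`, `NNP` for nouns, `JKG` for the genitive particle) and simply concatenates the nouns in order.

```python
# Assumed Sejong-style noun tags: general noun and proper noun.
NOUN_TAGS = {"NNG", "NNP"}


def index_term(tagged_phrase):
    """Join the nouns of a tagged phrase, dropping particles and spacing.

    tagged_phrase is a list of (morpheme, tag) pairs.
    """
    return "".join(morph for morph, tag in tagged_phrase if tag in NOUN_TAGS)


# Two surface variants with the same meaning ("retrieval of information").
with_particle = [("정보", "NNG"), ("의", "JKG"), ("검색", "NNG")]
without_particle = [("정보", "NNG"), ("검색", "NNG")]
```

Both variants reduce to the single index term `정보검색`, so a query in one form can match a document written in the other.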

The Incredible Shrinking Noun Phrase: Ongoing Change in Japanese Word Formation

  • Kevin Heffernan;Yusuke Imanishi
    • Asia Pacific Journal of Corpus Research / v.4 no.1 / pp.1-23 / 2023
  • The Japanese language, as a typical agglutinating language, permits large noun phrases (NPs) containing ten or more morphemes. In this paper, we argue that the nature of the NP in Japanese is changing. Our data are drawn from the Balanced Corpus of Contemporary Written Japanese. We conduct a series of apparent-time studies of ongoing changes in complex NPs. We first examine the length of compound nouns, followed by the usage of bound suffixes. We then examine ongoing changes in complex NPs that contain genitive case markers. Finally, we examine noun incorporation. All of our studies show a trend towards shorter, less complex NPs. Furthermore, our results suggest that the usage rate of phrases that modify the noun inside the NP (compound nouns, bound nouns, NPs containing genitive case, noun incorporation) appears to be decreasing over time. On the other hand, the usage rate of modifying material outside of the NP (positional phrases, relative clauses) appears to be increasing over time. We conclude by suggesting that our results reflect a diachronic change of decreasing synthetic morphology and increasing analytic morphology. We end by pointing out the implications of this work for our understanding of syntheticity and analyticity.

Morphological Analysis of the Korean Language (한국어의 형태소해석)

  • Lee, Soo-Hyon;Ozawa, S.;Lee, Joo-Keun
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.4 / pp.53-61 / 1989
  • A morphological analysis is described that extracts the information required for syntactic and semantic analysis of the Korean language. The noun and particle of a noun phrase are separated, selection conditions are specified for analyzing compound nouns, and a restoring rule is given for processing irregular compound nouns. The stem and ending of regular verbals are separated, and a logical representative form is proposed for anomalously inflected words and contracted vowels. The logical representation is composed of an attribute value and an analyzing rule. The redundancy of nouns in the dictionary is reduced because a verb of the form "Nounformed HA-" is processed separately as "noun" and "HA-", and the predicative "IDA" is analyzed by the Q parameter. The processing form of negation is also derived, and the morphemes and basic structure of compound predicative parts are presented.

An Index System using Restrictive Distance (거리 제한을 이용한 색인 시스템)

  • Park, Chan-Ee;Kim, Sang-Bok
    • Journal of the Korea Society of Computer and Information / v.11 no.1 s.39 / pp.273-282 / 2006
  • In this paper, we propose an indexing method that introduces the concept of distance between words into word weighting. The method targets the terms that frequently represent query words and document indexes: compound nouns, two or more adjacent nouns, and noun phrases. Its aim is that the farther apart these nouns are, the lower their selection rate as index terms. Index term candidates are first chosen by an existing weighting method, and the final candidates are then selected according to the distance between candidates, within a range of 3 sentences. Applied to 100 kinds of documents, including newspaper articles, scientific papers, and web documents, the method achieved accuracy rates of 92.03% for newspaper articles, 95% for scientific papers, and 73.33% for web documents.
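
The distance restriction can be pictured with a small scoring sketch. It is an assumption-laden illustration rather than the paper's formula: the function name, the `1 / (1 + gap)` decay, and the reading of "within 3 sentences" as a sentence gap of at most 3 are all choices made here for the example.

```python
# Assumed reading of the 3-sentence restriction: a gap of at most 3 sentences.
MAX_SENTENCE_GAP = 3


def pair_score(weight_a, weight_b, sentence_a, sentence_b):
    """Score a candidate noun pair; the score shrinks as the nouns move apart.

    weight_a and weight_b come from an existing term-weighting method;
    sentence_a and sentence_b are the sentence indices of the two nouns.
    """
    gap = abs(sentence_a - sentence_b)
    if gap > MAX_SENTENCE_GAP:
        return 0.0  # too far apart to be combined into one index term
    return (weight_a * weight_b) / (1 + gap)
```

Two candidates in the same sentence keep their full combined weight, the score halves when they are one sentence apart, and pairs beyond the window are dropped.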

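Lexical Structure of Korean Compound Verbal Nouns and Multiple Verbal Noun Constructions (한국어 합성 동사성 명사의 어휘구조와 다중 동사성명사 구문)

  • Ryu, Byong-Rae (류병래)
    • Proceedings of the Korean Society for Language and Information Conference / 2001.06a / pp.141-144 / 2001
  • The aim of this paper is to examine, in a theory-neutral way, the patterns of argument realization in 'Multiple Verbal Noun Constructions', and to formalize this analysis within recent Head-driven Phrase Structure Grammar (HPSG), a constraint-based grammar theory, in particular on the basis of a constraint-based lexicon that assumes multiple inheritance hierarchies, so as to describe and explain the process of argument realization. First, building on the generalizations about case realization in Grimshaw & Mester (1988), who analyzed a similar phenomenon in Japanese, we show that argument realization in Korean verbal noun constructions can be formalized with the theoretical device of 'argument transfer', and we propose the theoretical device of 'argument composition' for building the argument structure of compound verbal nouns. Furthermore, for the double case marking observed in the argument realization of multiple verbal noun constructions, we propose 'case copying' and argue that when a verbal noun is separated from the compound noun and realized at the sentence level, its case marking is realized by copying the same case. To support this claim, we show with examples that the case changes of subcategorized elements under grammatical-function-changing phenomena such as passive and active are not arbitrary. Since Grimshaw & Mester (1988), an analysis of Japanese light verbs, similar constructions in Korean have also been actively re-examined (see Ryu (1993b), 채희락 (1996), Chae (1997), among others). Previous discussion of Korean 'Verbal Noun Constructions', formed by combining a verbal noun with 'hada' (하다), has mostly focused on the grammatical phenomenon in which a single verbal noun combines with a so-called grammatical-function-changing 'light verb' such as 'hada' (하다) or 'doeda' (되다) to form a complex predicate. By comparison, there has been almost no discussion of complex predicates in which two or more verbal noun roots combine into a compound noun and that compound verbal noun then combines with a grammatical-function-changing 'light verb'. This observation is, in particular, indisputable within the HPSG framework. The subject of this paper is precisely the argument structure of such compound verbal nouns and the grammatical realization of the arguments subcategorized by the verbal nouns.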
