Range Detection of Wa/Kwa Parallel Noun Phrase using a Probabilistic Model and Modification Information

Choi, Yong-Seok; Shin, Ji-Ae; Choi, Key-Sun

Journal of KIISE: Software and Applications (한국정보과학회논문지: 소프트웨어 및 응용)

Vol. 35, No. 2, pp. 128-136, 2008. pISSN 1229-6848

Korean Institute of Information Scientists and Engineers (한국정보과학회)

Korean title: 확률모형과 수식정보를 이용한 와/과 병렬사구 범위결정

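Choi, Yong-Seok (Department of Computer Science, KAIST);
Shin, Ji-Ae (School of Engineering, Information and Communications University);
Choi, Key-Sun (Department of Computer Science, KAIST)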

Published: February 15, 2008


Abstract

Recognizing parallel structures at an early stage of Korean sentence parsing can reduce the complexity of parsing. In this paper, we propose an unsupervised, language-independent probabilistic model for the recognition of parallel noun structures. The model is based on the idea of swapping constituents, which relies on two properties of parallel structures: symmetry (the same structure is repeated in two or more constituents) and reversibility (the left and right constituents can be exchanged without changing the meaning). Parallel structures generally follow symmetry, but depending on the nature of a modifier, asymmetric structures in which the modifier attaches to only one side also occur. These asymmetric patterns, which the general symmetry rule cannot capture, are resolved with additional statistics on modification relations. In particular, this paper shows how the proposed model is applied to recognize Korean parallel noun phrases connected by the "wa/kwa" particle [1]. Experiments show that the model performs better at recognizing parallel noun phrases than other models, including supervised ones.
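The reversibility idea can be illustrated with a toy sketch: for each candidate pair of conjunct boundaries around the particle, exchange the two conjuncts and score how plausible the resulting tag sequence remains. Everything below is an assumption made for illustration and is not the model from the paper: the add-alpha bigram scorer, the function names `train`, `swap`, and `best_range`, and the use of part-of-speech tags as tokens are all hypothetical.

```python
import math
from collections import Counter
from itertools import product


def train(corpus):
    """Count unigrams and bigrams over a list of tag sequences (toy training)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    return unigrams, bigrams


def bigram_logprob(tokens, unigrams, bigrams, alpha=1.0):
    """Add-alpha smoothed bigram log-probability (an assumed stand-in scorer)."""
    vocab_size = max(len(unigrams), 1)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        num = bigrams[(prev, cur)] + alpha
        den = unigrams[prev] + alpha * vocab_size
        total += math.log(num / den)
    return total


def swap(tokens, left_start, conj, right_end):
    """Exchange the conjuncts tokens[left_start:conj] and tokens[conj+1:right_end+1]."""
    left = tokens[left_start:conj]
    right = tokens[conj + 1:right_end + 1]
    return tokens[:left_start] + right + [tokens[conj]] + left + tokens[right_end + 1:]


def best_range(tokens, conj, unigrams, bigrams):
    """Return (left_start, right_end) whose swapped sequence scores highest."""
    candidates = product(range(conj), range(conj + 1, len(tokens)))
    return max(
        candidates,
        key=lambda r: bigram_logprob(swap(tokens, r[0], conj, r[1]), unigrams, bigrams),
    )
```

A call such as `best_range(tags, conj_index, *train(tag_corpus))` returns the boundary pair under which the swapped sequence is most probable. A realistic system would also need the symmetry and modifier statistics described in the abstract, which this sketch omits.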


References

Kurohashi, Sadao and Makoto Nagao, 'KN Parser: Japanese dependency/case structure analyzer,' In Proceedings of the Workshop on Sharable Natural Language Resources, pp. 48-55, 1994
Abney, S., 'Parsing by Chunks,' In R.C. Berwick, S.P. Abney and C. Tenny, editors, Principle-Based Parsing: Computation and Psycholinguistics, Kluwer, pp. 257-278, 1991
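이관규, '국어 대등구성 연구' [A Study of Coordinate Constructions in Korean], 서광학술자료사, 1992
박준식, '품사 패턴을 이용한 한국어 병렬 구문의 해석' [Analysis of Korean Parallel Constructions Using Part-of-Speech Patterns], Master's thesis, KAIST, 1998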
Kurohashi, S. and Nagao, M., 'A Syntactic analysis method of long Japanese sentences based on detection of conjunctive structures,' Computational Linguistics, Vol.20, No.4, pp. 507-534, 1994
Quinlan, J. Ross, 'C4.5: Programs for Machine Learning,' Morgan Kaufmann Publishers, 1993
Joachims, Thorsten, Learning to Classify Text Using Support Vector Machines. Dissertation, Kluwer, 2002
Corbett, Edward P. J. Classical Rhetoric for the Modern Student. 3rd ed. NY: Oxford University Press, p. 428. 1990
The KAIST corpus 1996-1997, Korea Advanced Institute of Science and Technology, http://korterm.org/, 1997
Resnik, Philip, 'Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language,' Journal of Artificial Intelligence Research, Vol.11, pp. 95-130, 1999 https://doi.org/10.1613/jair.514
Jaynes, E.T., 'Information theory and statistical mechanics,' Physical Review, Vol.106, pp. 620-630, 1957 https://doi.org/10.1103/PhysRev.106.620
Ristad, Eric Sven, Maximum entropy modeling toolkit, release 1.6 beta, http://www.mnemonic.com/software/memt, 1998
Brown, P. F., S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer, 'The mathematics of statistical machine translation: Parameter estimation,' Computational Linguistics, Vol.19, pp. 263-312, 1993
Och, Franz Josef, Hermann Ney, 'A Systematic Comparison of Various Statistical Alignment Models,' Computational Linguistics, 29(1):19-51, 2003 https://doi.org/10.1162/089120103321337421
Choi, Yong-Seok, Ji-Ae Shin, Key-Sun Choi (2006), Identification of Boundaries in Parallel Noun Phrases: A Probabilistic Swapping Model, International Journal of Computer Processing of Oriental Languages, 19(2&3), 109-132 https://doi.org/10.1142/S0219427906001451
Choi, Key-Sun, Hee-Sook Bae, 'Procedures and Problems in Korean-Chinese-Japanese Wordnet with Shared Semantic Hierarchy,' WordNet Conference, pp. 320-325, Brno, Czech Republic, 2004
Yoon, Juntae, Key-Sun Choi, Mansuk Song, 'Corpus-Based Approach for Nominal Compound Analysis for Korean Based on Linguistic and Statistical Information,' Natural Language Engineering, Vol.7, No.3, pp. 251-270, 2001
