A Method for Clustering Noun Phrases into Coreferents for the Same Person in Novels Translated into Korean

Park, Taekeun; Kim, Seung-Hoon

doi:10.9717/kmms.2017.20.3.533

Journal of Korea Multimedia Society (한국멀티미디어학회논문지)

Volume 20, Issue 3, pp. 533-542, 2017. pISSN 1229-7771, eISSN 2384-0102

Korea Multimedia Society (한국멀티미디어학회)

한국어 번역 소설에서 인물명 명사구의 동일인물 공통참조 클러스터링 방법

Park, Taekeun (Dept. of Applied Computer Engineering, Dankook University) ;
Kim, Seung-Hoon (Dept. of Applied Computer Engineering, Dankook University)

박태근 ;
김승훈

Received : 2016.12.12
Accepted : 2017.01.17
Published : 2017.03.30

Abstract

The character names that appear in novels vary with the genre, the spatio-temporal setting, and the nationality of the characters. Moreover, characters and their names are products of the author's imagination, so no proper noun dictionary can cover every character name. In novels translated into Korean, character names also often consist of two or more nouns (such as "Harry Potter"). In this paper, we propose a method that extracts noun phrases for character names and clusters them into coreferents for the same character. To extract the noun phrases, we use the KKMA morpheme analyzer and the CPFoAN character identification tool. To cluster the noun phrases into coreferents, we construct a directed graph from the character names extracted by CPFoAN and the extracted noun phrases, and then create a name set for each character by traversing the connected subgraphs of that graph. We evaluate the proposed method with a survey on four novels translated into Korean. The results indicate that the proposed method can be useful for speaker identification as well as for constructing the social network of characters.
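The clustering step described above can be illustrated with a minimal sketch. The abstract does not specify how edges are defined, so the rule used here (an edge from a noun phrase to each already-identified character name it contains as a whitespace-separated token) is an assumption for illustration only, as are the function names `build_graph` and `name_sets`. The sketch does not call KKMA or CPFoAN; it takes their outputs as plain Python collections.

```python
from collections import defaultdict


def build_graph(character_names, noun_phrases):
    """Build a directed graph with an edge from each noun phrase to every
    known character name it contains (assumed edge rule, not the paper's)."""
    names = set(character_names)
    graph = defaultdict(set)
    for name in names:
        graph[name]  # register isolated names as nodes
    for phrase in noun_phrases:
        graph[phrase]  # register the phrase even if it links to nothing
        for token in phrase.split():
            if token in names and token != phrase:
                graph[phrase].add(token)
    return graph


def name_sets(graph):
    """Create one name set per connected subgraph of the directed graph."""
    # Edge direction is ignored during traversal so that two names linked
    # through a shared noun phrase end up in the same set.
    undirected = defaultdict(set)
    for src, targets in graph.items():
        undirected[src]
        for dst in targets:
            undirected[src].add(dst)
            undirected[dst].add(src)

    seen, result = set(), []
    for start in undirected:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(undirected[node] - seen)
        result.append(component)
    return result


if __name__ == "__main__":
    # Hypothetical inputs standing in for CPFoAN names and extracted phrases.
    graph = build_graph({"Harry", "Potter", "Ron"}, ["Harry Potter", "Ron Weasley"])
    for group in name_sets(graph):
        print(sorted(group))
```

With these hypothetical inputs, "Harry", "Potter", and "Harry Potter" fall into one name set, and "Ron" and "Ron Weasley" into another. Any real edge rule would need to handle Korean morphology, which is where a morpheme analyzer such as KKMA would come in.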

References

D.K. Elson and K.R. McKeown, "Automatic Attribution of Quoted Speech in Literary Narrative," Proceedings of the 24th AAAI Conference on Artificial Intelligence, pp. 1013-1019, 2010.
D.K. Elson, N. Dames, and K.R. McKeown, "Extracting Social Networks from Literary Fiction," Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 138-147, 2010.
E. Iosif and T. Mishra, "From Speaker Identification to Affective Analysis: A Multi-Step System for Analyzing Children's Stories," Proceedings of the 3rd Workshop on Computational Linguistics for Literature, pp. 40-49, 2014.
Stanford CoreNLP-A Suite of Core NLP Tools, http://nlp.stanford.edu/software/corenlp.shtml, (accessed Nov. 28, 2016).
T. Park and S.H. Kim, "A Character Identification Method Using Postpositions for Animate Nouns in Korean Novels," Journal of Information Technology Services, Vol. 15, No. 3, pp. 115-125, 2016.
T. Park and S.H. Kim, "A Character Identification Method Utilizing Connective and Possessive Forms of Animate Nouns in Novels Translated into or Written in Korean," IEICE Transactions on Information and Systems, 2016.
E.Y. Lee, "Named Entity Detection and Relation Extraction in the Personal Chronology of the 19th Century," Journal of EONEOHAG, Vol. 53, pp. 141-162, 2009.
G.M. Park, S.H. Kim, and H.G. Cho, "Analysis of Social Network According to the Distance of Character Statements," Journal of the Korea Contents Association, Vol. 13, No. 4, pp. 427-439, 2013. https://doi.org/10.5392/JKCA.2013.13.04.427
B.H. Back, I. Ha, and B.C. Ahn, "An Extraction Method of Sentiment Information from Unstructured Big Data on SNS," Journal of Korea Multimedia Society, Vol. 17, No. 6, pp. 671-680, 2014. https://doi.org/10.9717/kmms.2014.17.6.671
D.J. Lee, J.H. Yeon, I.B. Hwang, and S.G. Lee, "KKMA: A Tool for Utilizing Sejong Corpus Based on Relational Database," Journal of KIISE: Computing Practices and Letters, Vol. 16, No. 11, pp. 1046-1050, 2010.
