• Title/Abstract/Keywords: Corpus-based Analysis

Search results: 200 items (processing time: 0.027 seconds)

Corpus-Based Literary Analysis

  • 하명정
    • The Journal of the Korea Contents Association / Vol. 13, No. 9 / pp. 440-447 / 2013
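  • Corpus linguistics has rapidly gained ground as a research method in recent years and has contributed to a deeper understanding of literary texts as well as linguistic phenomena. Despite this rapid expansion, attempts to reinterpret classics and literary works on the basis of literary text corpora remain very rare in Korean linguistics. This study therefore used WordSmith, a computer concordance program and analytical tool of corpus linguistics, to investigate the stylistic characteristics and main themes of literary works that exist as large electronic texts. In particular, the study focuses on keywords, which indicate the salient characteristics of a text, and applies corpus-linguistic analysis to Shakespeare's tragedy Romeo and Juliet to shed new light on the work. The study is considered academically significant, and related follow-up research is expected.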
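
The keyword analysis mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which keyness statistic was used, so the log-likelihood measure below is an assumption, chosen because it is a common choice in keyword tools. The function names and the toy token lists are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: size of the study corpus, d: size of the reference corpus (in tokens).
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top_n=5):
    """Rank words that are relatively more frequent in the study corpus."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # keep positive keywords only
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Toy data, invented for illustration only.
study = "love love love night death love night".split()
reference = "the night was long and the day was short".split()
top = keywords(study, reference)
print(top)
```

In practice the reference corpus would be far larger than the study text, and a frequency threshold is usually applied before ranking.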

A Corpus-based Lexical Analysis of the Speech Texts: A Collocational Approach

  • Kim, Nahk-Bohk
    • English Language & Literature Teaching / Vol. 15, No. 3 / pp. 151-170 / 2009
  • Recently, speech texts have been increasingly used for English education because of their various advantages as language teaching and learning materials. The purpose of this paper is to analyze speech texts in a corpus-based lexical approach, and suggest some productive methods which utilize English speaking or writing as the main resource for the course, along with introducing the actual classroom adaptations. First, this study shows that a speech corpus has some unique features such as different selections of pronouns, nouns, and lexical chunks in comparison to a general corpus. Next, from a collocational perspective, the study demonstrates that the speech corpus consists of a wide variety of collocations and lexical chunks which a number of linguists describe (Lewis, 1997; McCarthy, 1990; Willis, 1990). In other words, the speech corpus suggests that speech texts not only have considerable lexical potential that could be exploited to facilitate chunk-learning, but also that learners are not very likely to unlock this potential autonomously. Based on this result, teachers can develop a learners' corpus and use it by chunking the speech text. This new approach of adapting speech samples as important materials for college students' speaking or writing ability should be implemented as shown in samplers. Finally, to foster learners' productive skills more communicatively, a few practical suggestions are made such as chunking and windowing chunks of speech and presentation, and the pedagogical implications are discussed.

A Study on the Diachronic Evolution of Ancient Chinese Vocabulary Based on a Large-Scale Rough Annotated Corpus

  • Yuan, Yiguo;Li, Bin
    • Asia Pacific Journal of Corpus Research / Vol. 2, No. 2 / pp. 31-41 / 2021
  • This paper makes a quantitative analysis of the diachronic evolution of ancient Chinese vocabulary by constructing and counting a large-scale rough annotated corpus. The texts from Si Ku Quan Shu (a collection of Chinese ancient books) are automatically segmented to obtain ancient Chinese vocabulary with time information, which is used to compute statistics on word frequency, standardized type/token ratio, and the proportions of monosyllabic and disyllabic words. Through data analysis, this study has the following four findings. Firstly, the high-frequency words in ancient Chinese are stable to a certain extent. Secondly, there is no obvious disyllabic trend in ancient Chinese vocabulary. Moreover, the Northern and Southern Dynasties (420-589 AD) and the Yuan Dynasty (1271-1368 AD) are probably the two periods with the most abundant vocabulary in ancient Chinese. Finally, the unique words with high frequency in each dynasty are mainly official titles with real power. These findings break away from the qualitative methods used in traditional research on Chinese language history and instead use quantitative methods to draw macroscopic conclusions from a large-scale corpus.
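
The two measures named in this abstract, the standardized type/token ratio and the proportion of monosyllabic and disyllabic words, can be sketched as follows. This is a minimal illustration and not the authors' pipeline. It assumes the text is already segmented into word tokens, that one Chinese character counts as one syllable, and that the standardized ratio is the mean type/token ratio over consecutive fixed-size chunks. The function names, the chunk size, and the toy token list are hypothetical.

```python
def sttr(tokens, chunk_size=1000):
    """Mean type/token ratio over consecutive full chunks of chunk_size tokens."""
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("need at least chunk_size tokens")
    return sum(ratios) / len(ratios)


def syllable_proportions(tokens):
    """Share of one-character and two-character tokens among all tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    total = len(tokens)
    mono = sum(1 for t in tokens if len(t) == 1)
    di = sum(1 for t in tokens if len(t) == 2)
    return mono / total, di / total


# Toy segmented text, invented for illustration only.
tokens = ["天", "地", "天", "人", "天下", "人", "人", "人"]
print(sttr(tokens, chunk_size=4))    # 0.625
print(syllable_proportions(tokens))  # (0.875, 0.125)
```

Averaging over equal-sized chunks makes the ratio comparable across periods with very different amounts of text, which a raw type/token ratio is not.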

The Effects of Corpus-Based Vocabulary Tasks on High School Students' English Vocabulary Learning and Attitude

  • 이현진;이은주
    • English Language & Literature Teaching / Vol. 16, No. 4 / pp. 239-265 / 2010
  • This study investigates the effects of corpus-based vocabulary tasks on the acquisition of English vocabulary in an attempt to explore the influence of corpus use on EFL pedagogy. For this to be realized, a total of 40 Korean high school students participated in the study over a 4-week period. An experimental group used a set of corpus-based tasks for vocabulary learning, whereas a control group carried out a traditional task (i.e., the L1-L2 translation) for vocabulary learning. To assess learning gains, the students were asked to complete the pre- and post-treatment tests measuring the word form, meaning, and use aspects of target lexical items. Results of the study indicate that in the experimental group the corpus-based vocabulary tasks were beneficial for the learning of word forms and use. In particular, corpus-based benefits were greatest in the low-proficiency EFL learners' collocational aspects of vocabulary use. On the other hand, in the control group, the traditional vocabulary tasks benefited the meaning aspects of target vocabulary items the most. In addition, survey results revealed that most students were positive about the corpus-based learning experience although some expressed reservations about the heavy cognitive load and the time-consuming nature of the analysis of corpus data primarily due to learners' lack of language proficiency.

'Because of Doing' and 'Because of Happening': A Corpus-based Analysis of Korean Causal Conjunctives, -nula(ko) and -nun palamey

  • Oh, Sang-Suk
    • Language and Information (Journal of the Korean Society for Language and Information) / Vol. 8, No. 2 / pp. 131-147 / 2004
  • This paper analyzes the two Korean causal conjunctive suffixes, -nula(ko) and -nun palamey, based on corpus linguistic analysis. Many of the linguistic accounts available, both in pedagogical reference and in the literature on linguistics, provide incomplete analyses of these suffixes, based on fabricated linguistic data. Using naturally occurring, real linguistic data, this paper examines the syntactic and semantic structures of the two causal suffixes through a consideration of three areas of corpus linguistic analysis: token frequencies, collocations, and semantic prosody. An analysis based on concordance data reveals that the two causal connectives, -nula(ko) and -nun palamey, have more differences than similarities in terms of syntactic and semantic constraints. The idiosyncratic structures of the two suffixes are discussed in terms of same subject condition, verb selection, same agent condition, synchronicity condition, and negative semantic prosody.

A Corpus-Based Study on Language Features and Literary Themes in the Yellow Wall-Paper and Herland by Charlotte Perkins Gilman

  • Lu, Hui-Chuan;Liu, Kai-Ling;Yeh, Chien-Ting;Chen, Ya-Jie
    • Asia Pacific Journal of Corpus Research / Vol. 3, No. 1 / pp. 21-34 / 2022
  • This study aims to apply a corpus-based approach to analyze The Yellow Wall-Paper and Herland written by Charlotte Perkins Gilman, a women's rights activist in late nineteenth-century America. Although both works have attracted feminists' attention to the woman question that concerned Gilman, discussion of her language features and their relation to the literary themes of these two works is still needed. In this corpus-based analysis, we argue that the main themes of different literary works can be revealed through linguistic patterns identified by number and gender features of nouns and pronouns in the contrast of two works and a balanced corpus. The linguistic features (number and gender) have been related to two themes, the 'group and individual' and the 'feminine and masculine', and are further interpreted in terms of mothering and feminine consciousness. By adopting a linguistic approach, our study provides quantitative and qualitative evidence to verify the established themes and arguments of these literary texts.

Study on Extraction of Headwords for Compilation of "Donguibogam Dictionary" - Based on Corpus-based Analysis -

  • 정지훈;김도훈;김동율
    • The Journal of Korean Medical History / Vol. 29, No. 1 / pp. 47-54 / 2016
  • This article attempts to extract headwords for the compilation of the "Donguibogam Dictionary" with corpus-based analysis. The computerized original text of Donguibogam is converted into a text file with the program 'EM Editor'. High-frequency Chinese characters in Donguibogam are extracted with the corpus analysis program 'AntConc'. Two-syllable, three-syllable, four-syllable, and five-syllable words containing each high-frequency Chinese character are extracted through n-cluster, one of the functions of AntConc. Lastly, the output that is meaningful as a word is sorted. As a result, words that often appear in Donguibogam can be sorted, and the names of books, medical herbs, disease symptoms, and prescriptions appear especially often. This corpus-based way of extracting headwords can suggest a better headword list for the "Donguibogam Dictionary" in the future.
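
The extraction steps described in this abstract (find high-frequency characters, then collect the two- to five-character sequences that contain them) can be sketched in a few lines. The study used the AntConc program for this, and the code below is not AntConc's implementation, only an illustration of the idea. It assumes the input is one continuous string with punctuation and line breaks already removed. The function names and the sample string are hypothetical, and the sample is not a quotation from Donguibogam.

```python
from collections import Counter


def high_frequency_chars(text, top_n=10):
    """Return the top_n most frequent characters in the text."""
    return [char for char, _ in Counter(text).most_common(top_n)]


def clusters_containing(text, char, min_len=2, max_len=5):
    """Count every character sequence of min_len..max_len that contains char."""
    counts = Counter()
    for size in range(min_len, max_len + 1):
        for i in range(len(text) - size + 1):
            gram = text[i:i + size]
            if char in gram:
                counts[gram] += 1
    return counts


# Placeholder string, invented for illustration only.
text = "人參湯人參散人參湯"
frequent = high_frequency_chars(text, top_n=2)
clusters = clusters_containing(text, "參")
print(frequent)
print(clusters.most_common(5))
```

As in the abstract, the raw cluster list still has to be filtered by hand, because many frequent sequences are not meaningful words.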

A Corpus Analysis of Temporal Adverbs and Verb Tenses Cooccurrence in Spanish, English, and Chinese

  • Cheng, An Chung;Lu, Hui-Chuan
    • Asia Pacific Journal of Corpus Research / Vol. 3, No. 2 / pp. 1-16 / 2022
  • This study investigates the cooccurrence between temporal adverbs and grammatical tenses in Spanish and contrasts temporal specifications across Spanish, English, and Chinese. Based on a monolingual Spanish corpus and a trilingual parallel corpus, the study identified the top ten frequent single-word temporal adverbs collocating with grammatical tenses in Spanish. It also contrasted the cooccurrence of temporal adverbs and verb tenses in three languages. The results show that aún 'still', hoy 'today', and ahora 'now' collocate with the present tense at more than 80%. Ayer 'yesterday' and finalmente 'finally' cooccurring with the simple past tense are at 84% and 69%, respectively. Then, mientras 'meanwhile' collocates with the past imperfect at 55%, the highest of all. Mañana 'tomorrow' cooccurs with the future and present tenses at 34%. Other adverbs, ya 'already', siempre 'always', and nuevamente 'again', do not present a strong cooccurrence tendency with a tense overall. The contrastive analysis of the trilingual parallel corpus shows a comprehensive view of temporal specifications in the three languages. However, no clear one-to-one mapping pattern of the cooccurrence across the three languages can be concluded, which provides helpful insights for second language instruction with natural language data rather than intuition. Future research with larger corpora is needed.

Modal Auxiliary Verbs in Japanese EFL Learners' Conversation: A Corpus-based Study

  • Nakayama, Shusaku
    • Asia Pacific Journal of Corpus Research / Vol. 2, No. 1 / pp. 23-34 / 2021
  • This research examines Japanese non-native speakers' (JNNS) modal auxiliary verb use from two different perspectives: frequency of use and preferences for modalities. Additionally, error analysis is carried out to identify errors in modal use common among JNNSs. Their modal use is compared to that of English native speakers within a spoken dialogue corpus which is part of the International Corpus Network of Asian Learners of English. Research findings show at a statistically significant level that when compared to native speakers, JNNSs underuse past forms of modals and infrequently convey epistemic modality, indicating the possibility that JNNSs fail to express their opinions or thoughts indirectly when needed or to convey politeness appropriately. Error analysis identifies the following three types of common errors: (1) the use of incorrect tenses of modal verb phrases, (2) the use of inflected verb forms after modals, and (3) the non-use of main verbs after modals. The first type of error arises largely because JNNSs have not mastered how to express past meanings of modals. The second and third types of errors seem to be due to first language transfer into second language acquisition and JNNSs' overgeneralization of the subject-verb agreement rules to modals, respectively.

Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean

  • 윤태진;강윤정
    • Phonetics and Speech Sciences / Vol. 6, No. 3 / pp. 139-145 / 2014
  • The paper describes methods of conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used for building the forced alignment system, and a subset of the corpus is used for the processing and extraction of features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to the results obtained using traditional analytical methods. The findings indicate that the methods adopted for the analysis can be extended and used for more fine-grained analysis without time-consuming manual labeling and without losing accuracy and reliability.