• Title/Summary/Keyword: Content/Function Words


A Comparative Study of Feature Extraction Methods for Authorship Attribution in the Text of Traditional East Asian Medicine with a Focus on Function Words (한의학 고문헌 텍스트에서의 저자 판별 - 기능어의 역할을 중심으로 -)

  • Oh, Junho
    • Journal of Korean Medical classics
    • /
    • v.33 no.2
    • /
    • pp.51-59
    • /
    • 2020
  • Objectives : This study examines which "feature" is most appropriate for authorship attribution in texts of Traditional East Asian Medicine. Methods : Using the 'Variorum of the Nanjing' as an experimental corpus, the authorship attribution performance of a Support Vector Machine (SVM) was compared by cross-validation across three choices: function words versus content words, single words versus collocations, and whether or not IDF weights were applied. Results : The combination 'function words/uni-bigram/TF' performed best, with an accuracy of 0.732, and the combination 'content words/unigram/TFIDF' showed the lowest accuracy, 0.351. Conclusions : For authorship attribution in texts of Traditional East Asian Medicine, the results indicate the following. First, function words play a more important role than content words. Second, collocations were relatively important among content words, whereas single words were more important among function words. Third, unlike in general text analysis, IDF weighting resulted in worse performance.
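
The feature choices compared in this abstract (word class, unigram versus uni-bigram, TF versus TF-IDF) can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function-word list, the token lists, the function names, and the decision to form bigrams over the filtered token sequence are all assumptions made for illustration, and the SVM classification step is omitted.

```python
import math
from collections import Counter

# Hypothetical function-word list (English stand-ins); the study's list is not given.
FUNCTION_WORDS = {"the", "of", "and", "is"}

def uni_bigram_counts(tokens, vocabulary):
    """Count unigrams and bigrams using only tokens found in `vocabulary`.

    Assumption: bigrams are formed over the filtered sequence, so two kept
    tokens form a bigram even if other words stood between them.
    """
    kept = [t for t in tokens if t in vocabulary]
    counts = Counter(kept)
    counts.update(a + "_" + b for a, b in zip(kept, kept[1:]))
    return counts

def tf(counts):
    """Relative term frequency: each count divided by the document total."""
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}

def tfidf(all_counts):
    """TF weighted by log(N / document frequency), one dict per document."""
    n = len(all_counts)
    df = Counter(f for counts in all_counts for f in counts)
    return [
        {f: v * math.log(n / df[f]) for f, v in tf(counts).items()}
        for counts in all_counts
    ]

docs = [
    "a the cat of the".split(),
    "the qi is of water".split(),
]
counts = [uni_bigram_counts(d, FUNCTION_WORDS) for d in docs]
tf_vectors = [tf(c) for c in counts]   # the 'TF' variant
tfidf_vectors = tfidf(counts)          # the 'TFIDF' variant
```

With this plain IDF formula, a feature that occurs in every document receives weight zero. That may be one reason IDF weighting can hurt when the informative features are very common function words, though the abstract itself does not state the cause.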

Word class information in perception of prosodic prominence by Korean learners of English

  • Im, Suyeon
    • Phonetics and Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.1-8
    • /
    • 2019
  • This study aims to investigate how prosodic prominence is perceived in relation to word class information (or parts-of-speech) by Korean learners of English compared with native English speakers in public speech. Two groups, Korean learners of English and native English speakers, were asked to judge words perceived as prominent simultaneously while listening to a speech. Parts-of-speech and three acoustic cues (i.e., max F0, mean phone duration, and mean intensity) were analyzed for each word in the speech. The results showed that content words tended to be higher in pitch and longer in duration than function words. Both groups of listeners rated prominence on content words more frequently than on function words. This tendency, however, was significantly greater for Korean learners of English than for native English speakers. Among the parts-of-speech of the content words, Korean learners of English were more likely than native English speakers to judge nouns and verbs as prominent. This study presents evidence that Korean learners of English consider most, if not all, content words as landing locations of prosodic prominence, in alignment with the previous study on the production of prominence.

An analysis of listening errors by Korean EFL learners from self-paced passage dictation

  • Cho, Hyesun
    • Phonetics and Speech Sciences
    • /
    • v.13 no.1
    • /
    • pp.17-24
    • /
    • 2021
  • In this study, listening errors by Korean EFL learners are comprehensively analyzed from self-paced passage dictation tasks. Fifty-five Korean EFL learners participated in the study. Listeners were asked to write down dictation passages as accurately as possible, while listening to the audio as much as they needed. The results show that (i) low-proficiency learners tend to misperceive longer phrases than high-proficiency learners, (ii) function words are more often omitted or misheard than content words, and (iii) low-proficiency learners have more difficulties with content words than high-proficiency learners do. The most frequent suffix errors were omissions of past or plural suffixes. Among the function words, the most frequent errors were found with auxiliary contractions, the infinitive marker to, and articles, mostly in the environment of linking and elision. It is also shown that C-V linking, C-C linking, and elision are the primary sources of the most frequent errors. C-V linking led to errors in correctly locating the word boundary, while C-C linking and elision resulted in omission. These errors show that Korean EFL listeners have difficulty detecting fine-grained phonetic details to the extent that native speakers can.

An evaluation of Korean students' pronunciation of an English passage by a speech recognition application and two human raters

  • Yang, Byunggon
    • Phonetics and Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.19-25
    • /
    • 2020
  • This study examined thirty-one Korean students' pronunciation of an English passage using a speech recognition application, Speechnotes, and two Canadian raters' evaluations of their speech according to the International English Language Testing System (IELTS) band criteria to assess the possibility of using the application as a teaching aid for pronunciation education. The results showed that the grand average percentage of correctly recognized words was 77.7%. From the moderate recognition rate, the pronunciation level of the participants was construed as intermediate and higher. The recognition rate varied depending on the composition of the content words and the function words in each given sentence. Frequency counts of unrecognized words by group level and word type revealed the typical pronunciation problems of the participants, including fricatives and nasals. The IELTS bands chosen by the two native raters for the Rainbow Passage had a moderately high correlation with each other. A moderate correlation was reported between the number of correctly recognized content words and the raters' bands, while an almost negligible correlation was found between the function words and the raters' bands. From these results, the author concludes that the speech recognition application could constitute a partial aid for diagnosing each individual's or the group's pronunciation problems, but further studies are still needed to match human raters.

A Study on the Function Words of Hwang je nae gyung-Somun ("황제내경소문(黃帝內經素問)" 허사연구(虛詞硏究))

  • Lee, Jae-Sun;Hwang, Woo-June;Lee, Si-Hyung;Keum, Kyeong-Su
    • Journal of the Korean Institute of Oriental Medical Informatics
    • /
    • v.14 no.1
    • /
    • pp.1-18
    • /
    • 2008
  • The elementary idea of 'function words' in Classical Chinese originates from the Han dynasty. Because of the pictographic nature of the script, however, the methodology for 'content words' had been applied to the study of 'function words', and this situation did not change until modern times. This study examines the syntactic and morphological functions of function words within the unit sentence, using quantitative analysis of all the function words that appear in 《Hwang je nae gyung-Somun》. Previous studies of function words collected and analyzed much data diachronically, but they failed to examine function words closely from a synchronic perspective. In addition, most explanations of the relevant function words centered on exegetical commentary. Where the function of function words within the unit sentence was explained at all, only some function words were covered, and their concrete function within the syntactic system was treated negligently. This study stands not only on the background of traditional studies but also on the basis of Western grammar and linguistics, especially descriptive grammar. 《Hwang je nae gyung-Somun》 collects and records mythology and special contents related to Daoism, and it was written on the basis of individual historical consciousness regardless of compilation system. The purpose of this study is to clarify, through a synchronic grammar system, the role and function of function words in the composition of the unit sentences that appear in 《Hwang je nae gyung-Somun》.


Lexical Sophistication Features to Distinguish the English Proficiency Level Using a Discriminant Function Analysis (판별분석을 통해 살펴본 영어 능력 수준을 구별하는 어휘의 정교화 특성)

  • Lee, Young-Ju
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.5
    • /
    • pp.691-696
    • /
    • 2022
  • This study explored the lexical sophistication features that distinguish group membership in English proficiency, using an automatic analysis program for lexical sophistication. A total of 600 essays written by 300 Korean college students were extracted from the ICNALE (International Corpus Network of Asian Learners of English) corpus, and a discriminant function analysis was performed using the SPSS program. Results showed that the lexical features that distinguish the three groups of English proficiency are SUBTLEXUS frequency content words, age of acquisition content words, lexical decision mean reaction time function words, and hypernymy verbs. High-level Korean students used frequent content words from the SUBTLEXUS corpus to a lesser degree and produced more sophisticated words that are learned at a later age and take a longer reaction time in a lexical decision task, as well as more concrete verbs.

The Relationship between Lexical Sophistication Features and English Proficiency for Korean College Students using TAALES Program (TAALES 프로그램을 활용하여 한국 대학생이 작성한 에세이에 나타난 어휘의 정교화 특성 비교)

  • Lee, Young-Ju
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.3
    • /
    • pp.433-438
    • /
    • 2021
  • This study investigates the relationship between lexical sophistication features and English proficiency for Korean college students. Essays from the ICNALE (International Corpus Network of Asian Learners of English) corpus were analyzed using the TAALES program. In order to examine whether there are statistically significant differences in lexical sophistication features across three groups, MANOVA was conducted. Results showed that the lexical sophistication features were significantly affected by English proficiency level. Essays written by Korean students with different English proficiency levels can be differentiated in terms of various lexical sophistication features, including content words frequency, content words familiarity, lexical decision mean reaction time function words, hypernymy verbs, word naming response time function words, and age of acquisition content words.

The Realization and Perception of English Contrastive Focus -A Comparative Study between Native Speakers of English and Korean Learners of English- (영어 대조 초점의 발화와 인지에 관한 연구 - 원어민 화자와 한국인 화자의 실현 양상 비교 -)

  • Jun, Ji-Hyun;Song, Jae-Yung;Lee, Hyun-Jung;Kim, Kee-Ho
    • Speech Sciences
    • /
    • v.9 no.4
    • /
    • pp.215-234
    • /
    • 2002
  • This study is designed for two purposes. The first one is to compare the realization and perception of English contrastive focus between Korean learners of English and native speakers of English. The second purpose is to study the phonological and phonetical features of contrastive focus by examining the results of production and perception experiments. English native speakers' results show that the English contrastive accents are generally accompanied by higher peak heights. The findings agree with the results of Bartels & Kingston (1994). Unlike native speakers of English, Korean speakers seem to be poor at relating the phonetical features of contrastive focus to their actual speech. Korean speakers' results are especially unsuccessful when the contrast is not distinctly grasped through syntactic structure, or when the function words are contrasted. Furthermore, Korean speakers' utterances tend to have pitch accents on every content word, whether the word is contrasted or not.


Analysis on Vocabulary Used in School Newsletters of Korean Elementary Schools: Focus on the areas of Busan, Ulsan and Gyeongnam (한국 초등학교 가정통신문의 어휘 특성 연구 -부산·울산·경남 지역을 중심으로-)

  • Kang, Hyunju
    • Journal of Korean language education
    • /
    • v.29 no.2
    • /
    • pp.1-23
    • /
    • 2018
  • This study aims to analyze words and phrases that are frequently used in newsletters from Korean elementary schools. To achieve this goal, high-frequency words from school newsletters were selected and classified into content and function words, and the domains of the words were identified. For this study, 1,000 school newsletters were collected in the areas of Busan, Ulsan and Gyeongnam. In terms of parts of speech, nouns, especially common nouns, appeared most frequently in the school newsletters, followed by verbs and adjectives. This result shows that for immigrant women who have basic knowledge of the Korean language, it is useful to provide translated words to help them get the message of school letters. Furthermore, school-related terms such as facilities, regulations and activities of the school, as well as Chinese-based vocabulary, are found in school newsletters. Among verbs, words that carry the meaning of requests and suggestions are used the most. Adjectives that relate to positive value and evaluation, or that describe weather and season, are frequently used as well.

An Analysis on the Vocabulary in the English-Translation Version of Donguibogam Using the Corpus-based Analysis (코퍼스 분석방법을 이용한 『동의보감(東醫寶鑑)』 영역본의 어휘 분석)

  • Jung, Ji-Hun;Kim, Dong-Ryul;Kim, Do-Hoon
    • The Journal of Korean Medical History
    • /
    • v.28 no.2
    • /
    • pp.37-45
    • /
    • 2015
  • Objectives : To conduct a quantitative analysis of the vocabulary in the English translation of Donguibogam. Methods : This study quantitatively analyzed the English-translated texts of Donguibogam using corpus-based analysis and compared the results with those from an analysis of the original Donguibogam text. Results : The corpus analysis of the English translation of Donguibogam found that the total number of words (tokens) was about 1,207,376, the number of distinct words (types) was about 20,495, and the TTR (Type/Token Ratio) was 1.69. The cumulative coverage of the 1,000 most frequent words was 83.54%, and that of the 2,000 most frequent words was 90.82%. Among the most frequent words, function words such as 'the, and, of, is' predominated, and among content words, words such as 'randix, qi, rhizoma and water' appeared with high frequency. Compared with the corpus analysis of the original Donguibogam, the TTR was higher in the English translation than in the original. The composition of high-frequency function words and content words was similar between the English translation and the original. The two versions were also similar in that the 'Remedies' and 'Acupuncture' parts showed a higher proportion of content words than of function words. Conclusions : The vocabulary of the English translation of Donguibogam shows that the book keeps complete sentence form while also being a Korean medical book. Meanwhile, the English translation had some problems, such as inconsistent vocabulary due to the involvement of several translators and the incomplete delivery of word meanings from the Chinese-character cultural area to the English cultural area; these problems should be taken into account when translating old Korean medical books into English.
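
The two corpus statistics reported in this abstract, the type/token ratio and the cumulative coverage of the most frequent words, can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' tooling: the function names and the sample sentence are hypothetical, and the TTR is expressed as a percentage on the assumption that the reported 1.69 is a percentage (reading the type count as 20,495, the approximate figures give roughly 1.7%).

```python
from collections import Counter

def type_token_ratio(tokens):
    """Return distinct types divided by total tokens, as a percentage."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens) * 100

def cumulative_coverage(tokens, top_n):
    """Return the percentage of all tokens covered by the top_n most frequent types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens) * 100

# Hypothetical sample; real input would be the tokenized translation.
sample = "the qi of the water and the qi".split()
ttr = type_token_ratio(sample)
top2 = cumulative_coverage(sample, 2)
```

Because TTR falls as a text grows longer, comparing the translation with the original on this measure is most meaningful when the two corpora are of similar size or the measure is normalized; the abstract does not say which approach was used.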