• Title/Summary/Keyword: speech analysis


Perceptive evaluation of Korean native speakers on the polysemic sentence final ending produced by Chinese Korean learners (KFL중국인학습자들의 한국어 동형다의 종결어미 발화문에 대한 원어민화자의 지각 평가 양상)

  • Yune, Youngsook
    • Phonetics and Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.27-36
    • /
    • 2020
  • The aim of this study is to investigate how the polysemic sentence-final ending "-(eu)lgeol" produced by Chinese learners of Korean is perceived. "-(Eu)lgeol" has two different meanings, a guess and a regret, and these meanings are distinguished by the prosodic features of the last syllable of "-(eu)lgeol". To examine how native Korean speakers perceive "-(eu)lgeol" sentences produced by Chinese learners of Korean, and to identify the most salient prosodic variable for the semantic discrimination of "-(eu)lgeol" at the perceptual level, we performed a perceptual experiment. The material consisted of four Korean sentences containing "-(eu)lgeol", two expressing guesses and two expressing regret. Twenty-five native Korean speakers participated in the experiment. Participants were asked to mark whether the "-(eu)lgeol" sentences they heard were (1) definitely regrets, (2) probably regrets, (3) ambiguous, (4) probably guesses, or (5) definitely guesses, based on the prosodic features of the last syllable of "-(eu)lgeol". The prosodic variables analysed were sentence boundary tones, slopes of boundary tones, the pitch difference between the sentence-final and penultimate syllables, and pitch levels of boundary tones. The results show that all the analysed prosodic variables are significantly correlated with the semantic discrimination of "-(eu)lgeol" and that, among them, the pitch difference between the sentence-final and penultimate syllables plays the most salient role.
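
Two of the prosodic variables named in this abstract, the pitch difference between the final and penultimate syllables and the slope of the boundary tone, can be illustrated with a minimal Python sketch. The abstract does not state the units or measurement procedure used in the study, so the semitone scale, the Hz-per-second slope, the function names, and the input values below are all assumptions made for illustration.

```python
import math


def semitone_difference(f0_penultimate_hz, f0_final_hz):
    """Pitch difference (final minus penultimate syllable) in semitones.

    The semitone scale is an assumption; the abstract does not name a unit.
    """
    return 12.0 * math.log2(f0_final_hz / f0_penultimate_hz)


def boundary_tone_slope(f0_start_hz, f0_end_hz, duration_s):
    """Average slope of the boundary tone in Hz per second (assumed unit)."""
    return (f0_end_hz - f0_start_hz) / duration_s


# Hypothetical F0 values, not data from the study.
diff = semitone_difference(200.0, 400.0)        # a rise of one octave
slope = boundary_tone_slope(200.0, 260.0, 0.2)  # 60 Hz rise over 0.2 s
print(round(diff, 2), round(slope, 1))
```

A positive difference or slope corresponds to a rising final syllable and a negative one to a falling final syllable; which direction maps to guess or regret is not stated in the abstract.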

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.69-92
    • /
    • 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze large social media datasets more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis methods to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle the more extensive semantic features that were underestimated in existing sentiment analysis. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that, among the related words extracted, words expressing some emotion about the keyword are three times more numerous with Word2Vec than with co-occurrence analysis. The difference comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis does not find. In addition, Part-Of-Speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotion words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords (professor, prosecutor, and doctor), because these keywords attract rich public emotion and opinion. Advance data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that the words extracted by Emotion Trigger processing have a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter data, so the data have a number of distinct features which we did not deal with in this study. These features will be considered in further study.
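
The contrast this abstract draws between co-occurrence analysis and vector-based relatedness can be sketched on toy data. The study itself reports using Java programs and Word2Vec; the Python sketch below is not that implementation. The tiny corpus, the two-dimensional vectors, and the function names are hypothetical and stand in for a trained model only to show the two kinds of score.

```python
from collections import Counter
import math


def cooccurrence_counts(sentences, keyword):
    """Count how many sentences each token shares with `keyword`."""
    counts = Counter()
    for tokens in sentences:
        if keyword in tokens:
            counts.update(t for t in set(tokens) if t != keyword)
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical tokenized sentences, not data from the study.
sentences = [
    ["professor", "angry", "scandal"],
    ["professor", "scandal", "news"],
    ["doctor", "angry"],
]
counts = cooccurrence_counts(sentences, "professor")

# Hypothetical word vectors; a real model would learn these from text.
vectors = {"professor": [1.0, 0.0], "angry": [0.8, 0.6], "news": [0.0, 1.0]}
sim = cosine_similarity(vectors["professor"], vectors["angry"])
print(counts.most_common(), round(sim, 3))
```

Co-occurrence only rewards words that appear in the same sentence as the keyword, whereas vector similarity can rank a word highly even if the two never co-occur, which is the kind of "hidden" relation the abstract refers to.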

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.59-83
    • /
    • 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in sentiment analysis studies of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, as the initial input to a deep learning model, is it more effective to use morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we achieve a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these questions, we generate various types of morpheme vectors reflecting the research questions and then compare classification accuracy using a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. For the training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. First, they come from two types of data source: Naver News, of high grammatical correctness, and Naver Shopping's cosmetics product reviews, of low grammatical correctness. Second, they differ in the degree of data preprocessing, namely sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: whether the morphemes are entered on their own or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of the morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. It seems that using same-domain text even with a lower degree of grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of any POS tag, including the incomprehensible category, lead to better classification accuracy. The POS tag attachment, which is devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included seem not to have any definite influence on classification accuracy.
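
The two input forms described in this abstract (morphemes alone versus morphemes with POS tags attached) and the CBOW context window can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the tag labels (VA, EC, EF), the "morpheme/TAG" token format, the example sentence, and the function names are assumptions, and the code only builds (context, target) training pairs rather than training vectors.

```python
def attach_pos(morphemes):
    """Join each (morpheme, tag) pair into one token, e.g. '예쁘/VA'.

    Attaching the tag keeps homonyms with different tags apart.
    """
    return [f"{morph}/{tag}" for morph, tag in morphemes]


def cbow_pairs(tokens, window=5):
    """Build (context, target) pairs for CBOW.

    The context is up to `window` tokens on each side of the target.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, target))
    return pairs


# Hypothetical morpheme analysis of a short review; tags are assumed labels.
tagged = [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")]

plain_tokens = [morph for morph, _ in tagged]  # morphemes only
tagged_tokens = attach_pos(tagged)             # morphemes with POS tags

pairs = cbow_pairs(tagged_tokens, window=5)
print(pairs[0])
```

With a window of 5 and a four-token sentence, every other token in the sentence falls inside the context of each target; longer sentences would truncate the context to five tokens per side.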

Identification and Clinical Implications of Novel MYO15A Mutations in a Non-consanguineous Korean Family by Targeted Exome Sequencing

  • Chang, Mun Young;Kim, Ah Reum;Kim, Nayoung K.D.;Lee, Chung;Lee, Kyoung Yeul;Jeon, Woo-Sung;Koo, Ja-Won;Oh, Seung Ha;Park, Woong-Yang;Kim, Dongsup;Choi, Byung Yoon
    • Molecules and Cells
    • /
    • v.38 no.9
    • /
    • pp.781-788
    • /
    • 2015
  • Mutations of MYO15A are generally known to cause severe to profound hearing loss across all frequencies. Here, through targeted resequencing of 134 known deafness genes, we found two novel MYO15A mutations, c.3871C>T (p.L1291F) and c.5835T>G (p.Y1945X), in an affected individual with congenital profound sensorineural hearing loss (SNHL). The variants p.L1291F and p.Y1945X reside in the myosin motor and IQ2 domains, respectively. Three-dimensional protein modeling predicted that the p.L1291F variant affects the structure of the actin-binding site, thereby interfering with the correct interaction between actin and myosin. In the literature analysis, mutations in the N-terminal domain were more frequently associated with residual hearing at low frequencies than mutations in the other regions of this gene. We therefore suggest a hypothetical genotype-phenotype correlation whereby MYO15A mutations that affect domains other than the N-terminal domain lead to profound SNHL across all frequencies, whereas mutations that affect the N-terminal domain result in residual hearing at low frequencies. This genotype-phenotype correlation suggests that preservation of residual hearing during auditory rehabilitation such as cochlear implantation should be the aim for those who carry mutations in the N-terminal domain, and that individuals with mutations elsewhere in MYO15A require early cochlear implantation to initiate speech development in a timely manner.
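
The coding-DNA and protein positions quoted in this abstract can be cross-checked with simple arithmetic, since coding positions are numbered from 1 at the first base of the start codon and each codon spans three bases. The short Python sketch below shows that check; the function name is hypothetical, and the sketch assumes both positions lie in the same uninterrupted coding sequence.

```python
def codon_of(cds_position):
    """Map a 1-based coding-DNA position to (codon number, base within codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    base_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, base_in_codon


first = codon_of(3871)   # c.3871C>T
second = codon_of(5835)  # c.5835T>G
print(first, second)     # prints (1291, 1) (1945, 3)
```

Position 3871 is the first base of codon 1291 and position 5835 is the third base of codon 1945, which agrees with the protein-level names p.L1291F and p.Y1945X given in the abstract.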

School Phonetics and How to Teach Prosody of English in Japan

  • Tsuzuki, Masaki
    • Proceedings of the KSPS conference
    • /
    • 1997.07a
    • /
    • pp.11-25
    • /
    • 1997
  • This presentation will focus on building basic English prosodic skills, which are very useful for Japanese learners of English. The focus will first be on recognizing the seven basic nuclear tones, analysing intonation structures, and distinguishing intonation patterns, and then on ways of improving speaking ability using sufficient verbal contents of intonation (mini-dialogues). My presentation deals mainly with some difficulties which Japanese learners of English have in the field of RP intonation. It is chiefly concerned with identifying, describing and analysing tone-group sequences. It sometimes happens that Japanese learners of English can pronounce isolated sounds correctly and read phonetic symbols sufficiently, but have serious problems in producing accurate prosodic features. The use of wrong intonation is sometimes the cause of misunderstanding of the speaker's attitude, connotation, shades of meaning, etc. However accurately students can pronounce the nuclear tone or tone-group of English, they have to learn how to connect tone-groups properly into suitable sequences with respect to meaning or implication. We are faced with the complicated theory of RP intonation on the one hand and the difficult realization of it on the other. Japanese learners of English have special difficulties in employing the "rising tune" and the "falling + rising tune". If students are taught pitch movements by indicating dots graphically between two horizontal lines, they can easily understand the whole shape of the pitch movements. In this presentation, I illuminate several tone-group sequences which are very useful for Japanese learners of English intonation. Among them, four similar pitch patterns, namely (1) (equation omitted)-type, (2) (equation omitted)-type, (3) (equation omitted)-type and (4) (Rising Head) (equation omitted)-type, are clarified, and other important tone-group sequences are also highlighted from the point of view of teaching English as a foreign language. The intonation theory, tone marks and technical terms are, in all essentials, those of Intonation of Colloquial English by O'Connor, J. D. and Arnold, G. F., Longman, 2nd ed., 1982. The changes of tone are shown graphically between two horizontal lines representing the ordinary high and low zones of the utterance. A. C. Gimson (1981:314): The intonation of English has been studied in greater detail and for longer than that of any other language. No definitive analysis, classifying the features of RP intonation, has yet appeared (though that presented by O'Connor and Arnold (1973) provides the most comprehensive and useful account from the foreign learner's point of view).


A Case Study on the Experience of Science Teacher Participating in Peer Coaching Meetings (동료 장학 모임에 참여한 과학교사의 경험 사례 연구)

  • Chung, Haengnam;Choi, Byungsoon
    • Journal of The Korean Association For Science Education
    • /
    • v.33 no.1
    • /
    • pp.63-78
    • /
    • 2013
  • The purposes of this study were to explore the process of experience that science teachers go through when participating in peer coaching meetings to improve their teaching ability, and to identify the factors that affect each stage of that experience. The data were collected through recordings of peer coaching meetings, videotapes of science classes, and interviews. All the data were analyzed after transcription. The results showed that even though Teacher K broke the ice and formed consensus among the peers by developing a Content Representation (CoRe) at the beginning of the meetings, he became self-defensive rather than receptive to peers' opinions on the recorded class during the discussion session. But as the peer coaching went on, he realized that peer coaching was not about evaluation but about improving his teaching ability. In turn, he was able to look at his teaching from a more objective point of view and accepted suggestions from the peer coaching discussions. Teacher K's self-reflection acted as the key factor in his efforts to improve his teaching ability. He sought concrete alternatives through class analysis with fellow teachers and showed major changes in his teaching practice, from his language habits, pronunciation, and speed of speech to his interaction with students and class design. However, there was little change in his knowledge of curriculum and assessment, owing to his strong orientation, as an academic high school teacher, toward improving students' grades. Thus, while peer coaching exerted a strong influence on Teacher K's instructional methods and strategies, his strong orientation toward improving students' grades hindered a balanced development of the subcomponents of PCK.

Phonatory Characteristics of Vowels and Resonant Consonants using the Electroglottography (전기성문파형검사를 이용한 모음과 공명 자음의 발성특성)

  • Choi, Seong-Hee;Nam, Do-Hyun;Lim, Jae-Yol;Lim, Sung-Eun;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.15 no.2
    • /
    • pp.133-140
    • /
    • 2004
  • Background and Objectives: Vowels and resonant consonants, including nasals and liquids, are produced with vocal fold vibration and have been used in voice therapy for patients with hyperadduction. This study was conducted to investigate the phonatory characteristics of vowels and resonant consonants through EGG measures from Lx. Speech studio (Laryngograph Ltd, UK). Materials and Method: Seven male adults produced the sustained vowels /a/, /i/, /u/, the nasals /m/, /n/, /ŋ/ and the liquid /l/, read two sentences (one nasals-liquid sentence and one non-nasals-liquid sentence), and produced a tongue-tip trill and humming. Fx(Hz) and Qx(%) were obtained for the vowels, nasals, and liquid, and for the vowel /a/ following the consonant in /ma/, /na/, /la/, /ha/, at the same F0 (around F#165Hz) and amplitude (75±5 dB). DFx(Hz), DQx(%), CFx(%) and CAx(%) were also obtained from the reading of the two kinds of sentences. Results: Qx(%) was highest for /u/ among the vowels and for the nasal /n/ among the resonant consonants, and the nasals-liquid sentence had a higher Qx than the non-nasals-liquid sentence, but no significant differences were found. Qx(%) of the vowel /a/ following the nasal consonant /n/ was higher than that of the isolated vowel /a/ and of the vowels following the other resonant consonants and the fricative /h/. In the graphs of QxFx and CFx produced by quantitative analysis, greater regularity or periodicity and a higher Qx were observed in the nasals-liquid sentence than in the non-nasals-liquid sentence. In the nasalance score, the vowel /u/ was significantly higher than the other vowels, the liquid /l/ was significantly lower than the other resonant consonants, and the nasals-liquid sentence was higher than the non-nasals-liquid sentence. CQ(%) was not significantly correlated with nasalance(%). Conclusion: These findings might signify that resonant phonation is not correlated with nasalance.


A STUDY OF PSYCHOSOCIAL VARIABLES WITHIN ADHD WITH OR WITHOUT EXTERNALIZING SYMPTOM (ADHD 아동과 외면화 증후를 공존질환으로 갖는 ADHD 아동간의 심리사회적 변인에 관한 비교연구)

  • Lee, Kyung-Sook;Ryu, Yoon-Jung;Ahn, Dong-Hyun;Shin, Yee-Jin
    • Journal of the Korean Academy of Child and Adolescent Psychiatry
    • /
    • v.7 no.2
    • /
    • pp.203-212
    • /
    • 1996
  • In this study, we investigate the psychosocial variables within the family environments of children with Attention Deficit Hyperactivity Disorder with (ADHD+CD/ODD) or without (ADHD) externalizing symptoms. The subjects were 86 boys and girls (aged 6 to 14 years), consisting of 20 children with ADHD, 22 with comorbid ADHD (ADHD+CD/ODD), and 44 normal controls (NC). We collected data on the children and their mothers. The psychosocial variables included in the analysis are socioeconomic status, parents' educational level, stressful life events, and the rate of psychiatric disorders in relatives. Self-report questionnaires on marital discord (MAS), parenting stress (PSI), and parenting attitude (MBRI) were completed by the mothers. The results indicated that the ADHD+CD/ODD group appears to have a higher level of family adversity, as suggested by lower SES, lower parental educational level, more stressful life events, and more psychiatric disorders in relatives, compared with the ADHD or normal control groups. On the MAS, the ADHD+CD/ODD group had significantly the lowest scores on each factor of the measure of marital adjustment. Parents of ADHD+CD/ODD children are much more likely to have positive parenting stress than the parents of ADHD children. In particular, mothers of ADHD+CD/ODD children tend to have the lowest mean scores on affective, accepting attitude. Regarding inappropriate parenting attitudes as perceived by children, fathers of ADHD+CD/ODD children have the most negative and contradictory attitudes, and mothers of ADHD+CD/ODD children have the most restrictive, negative and contradictory attitudes.


A Study on Spoken Digits Analysis and Recognition (숫자음 분석과 인식에 관한 연구)

  • 김득수;황철준
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.6 no.3
    • /
    • pp.107-114
    • /
    • 2001
  • This paper describes connected digit recognition in Korean that takes acoustic features into account. The recognition rate for connected digits is usually lower than that for word recognition. Therefore, speech feature parameters and acoustic features are employed to build a robust model for digits, and we confirmed the effect of considering acoustic features through recognition experiments. We used the KLE 4 connected digit database and 19 continuous-distribution HMMs as PLUs (Phoneme-Like Units) based on phonetic rules. For the recognition experiments, we tested two cases. In the first case, we used the usual method of constructing phoneme models with Mel-Cepstrum and Regressive Coefficients. In the second case, we used expanded feature parameters and acoustic features to construct the phoneme models. In both cases, we employed OPDP (One Pass Dynamic Programming) and FSA (Finite State Automata) for the recognition tests. When applying FSN for recognition, we applied various acoustic features. As a result, we obtained a recognition rate of 55.4% for Mel-Cepstrum, and 67.4% for Mel-Cepstrum and Regressive Coefficients. We also obtained a recognition rate of 74.3% for the expanded feature parameters, and 75.4% when applying acoustic features. Since applying acoustic features gave better results than the former method, we conclude that the suggested method is effective for connected digit recognition in Korean.
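
The "Regressive Coefficient" feature in this abstract is not defined there. Assuming it refers to the regression (delta) coefficients commonly appended to cepstral features, the usual formula over a window of N frames on each side is d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2). The Python sketch below implements that formula; the window width, the edge handling (repeating the first and last frame), and the toy input are assumptions, not details from the paper.

```python
def delta_coefficients(frames, width=2):
    """Regression (delta) coefficients for a sequence of feature frames.

    frames: list of equal-length lists of floats, one list per time frame.
    Frames beyond either end are replaced by the nearest edge frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                num += n * (ahead - behind)
            row.append(num / denom)
        deltas.append(row)
    return deltas


# Toy one-dimensional "cepstrum" that rises by 1.0 per frame.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
deltas = delta_coefficients(frames)
print(deltas)
```

For the middle frame of this linear ramp the delta equals the per-frame slope, while the edge frames give smaller values because of the repeated-frame padding.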


Language performance analysis based on multi-dimensional verbal short-term memories in patients with conduction aphasia (다차원 구어 단기기억에 따른 전도 실어증 환자의 언어수행력 분석)

  • Ha, Ji-Wan;Hwang, Yu Mi;Pyun, Sung-Bom
    • Korean Journal of Cognitive Science
    • /
    • v.23 no.4
    • /
    • pp.425-455
    • /
    • 2012
  • Multi-dimensional verbal short-term memory mechanisms are broadly divided into the phonological channel and the lexical-semantic channel. The former is called phonological short-term memory and the latter semantic short-term memory. Phonological short-term memory is further divided into the phonological input buffer and the phonological output buffer. In this study, the language performance of three patients with similar levels of conduction aphasia was analyzed in terms of multi-dimensional verbal short-term memory. To this end, the three patients were instructed to perform four types of language tasks, namely spontaneous speaking, repetition, spontaneous writing, and dictation, at both the word and sentence levels. In addition, the patients' phonological and semantic short-term memories were evaluated using digit span tests and verbal learning tests. The three subjects exhibited various types of performance and error responses across the four language tasks, and the short-term memory tests also did not produce identical results. The language performance of the three patients with conduction aphasia can be explained according to whether the deficits occurred in semantic short-term memory, the phonological input buffer, and/or the phonological output buffer. The relations between language and multi-dimensional verbal short-term memory are discussed based on the results of the language tests and short-term memory tests in patients with conduction aphasia.
