• Title/Abstract/Keywords: speech features

648 results (processing time: 0.021 s)

Acoustic Measurement of English read speech by native and nonnative speakers

  • Choi, Han-Sook
    • 말소리와 음성과학 / Vol. 3, No. 3 / pp.77-88 / 2011
  • Foreign accent in second language production depends heavily on the transfer of features from the first language. This study examines acoustic variation in segments and suprasegmentals produced by native and nonnative speakers of English, searching for patterns of transfer and plausible indexes of foreign accent in English. The acoustic variation is analyzed in recorded read speech from 20 native English speakers and 50 Korean learners of English, in terms of vowel formants, vowel duration, and syllabic variation induced by stress. The results show that vowel formants and vowel and syllable durations differ between native and nonnative speakers. The difference is robust in the production of lax vowels, diphthongs, and stressed syllables, namely the English-specific features. L1 transfer to L2 specification is found at both the segmental and the suprasegmental levels. The transfer levels measured for groups and individuals further show a continuum of divergence from the native-like target. Overall, the eldest group, graduate students, shows more native-like patterns, suggesting a weaker foreign accent in English, whereas the high school students tend to deviate more from the native speakers' patterns. Individual results show interdependence between segmental transfer and prosodic transfer, and a correlation with self-reported proficiency levels. Additionally, experience factors such as length of English study and length of residence in English-speaking countries are discussed as factors explaining the acoustic variation.


안면 움직임 분석을 통한 단음절 음성인식 (Monosyllable Speech Recognition through Facial Movement Analysis)

  • 강동원;서정우;최진승;최재봉;탁계래
    • 전기학회논문지 / Vol. 63, No. 6 / pp.813-819 / 2014
  • The purpose of this study was to extract accurate parameters of facial movement features using a 3-D motion capture system for lip-reading-based speech recognition. Instead of features obtained from traditional camera images, the 3-D motion system was used to obtain quantitative data on actual facial movements and to analyze 11 variables that exhibit particular patterns, such as nose, lip, jaw, and cheek movements, in monosyllable vocalizations. Fourteen subjects, all in their twenties, were asked to vocalize 11 types of Korean vowel monosyllables three times each with 36 reflective markers on their faces. The facial movement data were then converted into 11 parameters and presented as patterns for each monosyllable vocalization. The parameter patterns were learned and recognized for each monosyllable using a speech recognition algorithm based on a hidden Markov model (HMM) and the Viterbi algorithm. The recognition accuracy for the 11 monosyllables was 97.2%, which suggests the possibility of recognizing Korean speech through quantitative facial movement analysis.
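
The abstract above names an HMM with Viterbi decoding but gives no model details. As a rough illustration only, the following Python sketch shows Viterbi decoding on a toy two-state HMM. The state names, the quantized "small"/"large" observations, and all probabilities are hypothetical and are not taken from the paper.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_path, log_prob) for an observation sequence under an HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of s on the best path at time t
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def logs(probs):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in probs.items()}

# Hypothetical toy model: hidden mouth states, quantized lip-aperture observations.
states = ["closed", "open"]
log_start = logs({"closed": 0.8, "open": 0.2})
log_trans = {"closed": logs({"closed": 0.6, "open": 0.4}),
             "open": logs({"closed": 0.3, "open": 0.7})}
log_emit = {"closed": logs({"small": 0.9, "large": 0.1}),
            "open": logs({"small": 0.2, "large": 0.8})}

path, score = viterbi(["small", "large", "large"], states,
                      log_start, log_trans, log_emit)
```

Working in log-probabilities avoids numerical underflow on longer sequences. A recognizer in this style would typically train one HMM per monosyllable and pick the model with the highest score, but whether the paper does so is not stated in the abstract.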

한국인 영어학습자의 영어리듬구현 연구 (A Study on the Rhythm of Korean EFL Learners' English Pronunciation)

  • 정현성
    • 말소리와 음성과학 / Vol. 1, No. 2 / pp.141-149 / 2009
  • An emphasis on teaching suprasegmental features of English, specifically English rhythm, is essential in order to improve the 'intelligibility' of the pronunciation of Korean EFL learners among interlocutors who use English as a Lingua Franca (ELF). By redefining the ELF suggested by Jenkins (2000, 2002), this paper argues that the Lingua Franca Core (LFC) must include suprasegmental features such as 'stress-based rhythm' and word stress. However, because 'isochrony' is difficult to measure in a foot, the rhythm unit must be expanded to an intonational phrase that contains a prominence, and the rhythm of the unit can be measured by calculating the duration of each segment in context. The rhythmic pattern of Korean learners of English and that of native speakers or other non-native English speakers can then be calculated and compared using correlation coefficients of the segmental durations. In terms of sociolinguistic factors, improving the 'comprehensibility' and 'accentedness' of Korean EFL learners' pronunciation is also important in international communication, which calls for more emphasis on suprasegmental features.

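
The comparison of rhythmic patterns by correlation coefficients of segmental durations, as described in the abstract above, can be sketched in a few lines of Python. This is a minimal illustration, assuming a Pearson correlation over aligned segment durations. The function name and the duration values (in milliseconds) are hypothetical and do not come from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical durations (ms) of the same five segments in one intonational phrase.
native = [60, 140, 50, 180, 70]    # strong contrast between weak and strong segments
learner = [90, 120, 85, 130, 95]   # same pattern, but with a compressed contrast

r = pearson(native, learner)
```

Note that a correlation coefficient captures the shape of the durational pattern, not its magnitude, so a learner with a compressed long/short contrast can still correlate highly with the native pattern. A complementary measure of durational range may be needed to expose that difference.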

A Study on Korean Students' Production and Perception of English Word-final Stop Voicing

  • Kang, Seok-Han
    • 음성과학 / Vol. 14, No. 1 / pp.105-119 / 2007
  • The purpose of this study is to examine Korean students' production and perception of word-final stop voicing in light of their overseas experience. Subjects were English native speakers, Korean university students with residence experience in America, Korean university students without residence experience in America, and Korean elementary school students. They participated in both production and perception tests. Results showed that the production and perception of the students with residence experience in America were quite similar to those of the English native speakers. In the production tests, we found somewhat different results for temporal and frequency features. One year of residence in America had some influence on their frequency features, but not on the temporal features, in word-final stop production. That difference could be seen in the perception tests, too. We could not find any difference in the identification test of the final release environment between the Korean university students who had studied abroad and those who had not. Rather, the difference was found in the cue influence test in both the final release and non-release environments.


한국어 특성과 CRFs를 이용한 자동 띄어쓰기 시스템 (Automatic Word Spacing for Korean Using CRFs with Korean Features)

  • 이현우;차정원
    • 대한음성학회지:말소리 / No. 65 / pp.125-141 / 2008
  • In this work, we propose an automatic word spacing system for Korean using conditional random fields (CRFs) with Korean features. We map the word spacing problem onto a classification problem. We first build a basic system that uses CRFs and Eumjeol bigrams, and then analyze the results of an inner test. We extend the basic system by adding Korean features, namely Josa, Eomi, and the two head Eumjeols of words extracted from a lexicon. The experimental results show that the proposed method is better than previous methods. Additionally, the proposed method can be used in mobile and speech applications because the model is very small.

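
To illustrate how word spacing can be mapped onto a classification problem, as the abstract above describes, the following Python sketch converts a spaced sentence into per-Eumjeol (syllable) labels and back. It does not train a CRF. The label scheme (1 = a space follows this syllable), the function names, and the bigram feature template are assumptions made for illustration, not the paper's actual design.

```python
def to_labels(sentence):
    """Split a correctly spaced sentence into syllables and binary labels.

    Label 1 means "a space follows this syllable"; label 0 means no space.
    """
    syllables, labels = [], []
    for word in sentence.split():
        for i, ch in enumerate(word):
            syllables.append(ch)
            labels.append(1 if i == len(word) - 1 else 0)
    if labels:
        labels[-1] = 0  # no space after the final syllable of the sentence
    return syllables, labels

def apply_labels(syllables, labels):
    """Rebuild a spaced sentence from syllables and predicted labels."""
    out = []
    for ch, label in zip(syllables, labels):
        out.append(ch)
        if label == 1:
            out.append(" ")
    return "".join(out)

def bigram_features(syllables, i):
    """Hypothetical syllable-bigram features for position i (input to a classifier)."""
    prev = syllables[i - 1] if i > 0 else "<s>"
    nxt = syllables[i + 1] if i + 1 < len(syllables) else "</s>"
    return {"cur": syllables[i],
            "prev+cur": prev + syllables[i],
            "cur+next": syllables[i] + nxt}

syllables, labels = to_labels("나는 학교에 간다")
restored = apply_labels(syllables, labels)
features = bigram_features(syllables, 0)
```

In a full system, a sequence model such as a CRF would predict the label sequence from features like these for unspaced input, and `apply_labels` would insert the spaces.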

깊은 신경망 특징 기반 화자 검증 시스템의 성능 비교 (Performance Comparison of Deep Feature Based Speaker Verification Systems)

  • 김대현;성우경;김홍국
    • 말소리와 음성과학 / Vol. 7, No. 4 / pp.9-16 / 2015
  • In this paper, several experiments are performed to compare the performance of speaker verification (SV) systems that use deep neural network (DNN) based features. To this end, input features for a DNN, such as mel-frequency cepstral coefficients (MFCC), linear-frequency cepstral coefficients (LFCC), and perceptual linear prediction (PLP), are first compared in terms of SV performance. Then the effect of the DNN training method and the structure of the DNN hidden layers on SV performance is investigated for each type of feature. The performance of an SV system is then evaluated using an i-vector or probabilistic linear discriminant analysis (PLDA) scoring method. The SV experiments show that a tandem feature combining the DNN bottleneck feature and the MFCC feature gives the best performance when the DNNs are configured with rectangular hidden layers and trained with a supervised training method.

감성 인식을 위한 강화학습 기반 상호작용에 의한 특징선택 방법 개발 (Reinforcement Learning Method Based Interactive Feature Selection(IFS) Method for Emotion Recognition)

  • 박창현;심귀보
    • 제어로봇시스템학회논문지 / Vol. 12, No. 7 / pp.666-670 / 2006
  • This paper presents a novel feature selection method for emotion recognition, a task that may involve a large number of original features. Specifically, the emotion recognition in this paper deals with emotional speech signals. Feature selection benefits pattern recognition performance and mitigates the 'curse of dimensionality'. We therefore implemented a simulator called 'IFS', and its results were applied to an emotion recognition system (ERS), which was also implemented for this research. Our feature selection method is based on reinforcement learning, and since it needs responses from a human user, it is called 'Interactive Feature Selection'. By running the IFS, we obtained the 3 best features and applied them to the ERS. The 3 best features performed better than a randomly selected feature set.

An Application of Announcing techniques to the teaching of speech for non-native speakers of Japanese

  • Tomoko Shimoda
    • 대한음성학회:학술대회논문집 / 대한음성학회 October 1996 Conference Proceedings / pp.168-168 / 1996
  • In this paper I will examine some concrete examples of the obstacles faced by non-native speakers of Japanese when learning the language. I will go on to suggest ways in which these obstacles may be overcome. Nowadays there are numerous Japanese language books available for non-native speakers. However, most of these introductory Japanese language books focus on topics such as pronunciation, accent and intonation. Notably, these introductory textbooks place insufficient emphasis on prosodic features of the Japanese language. The Japanese language has been considered by many teachers to be relatively easy compared to other languages, due to its simple phonetic structure. This may partly explain why the teaching of prosodic features has generally been given insufficient emphasis. To teach Japanese efficiently at university level, I have combined an emphasis on the teaching of prosodic features with my experience of television announcing. This has entailed using television news programmes and contemporary reading materials in my class. Using taped material, I intend to describe a case study of teaching Japanese articulation.


성대용종 환자의 후두미세수술 전후 공기역학 변수 변화 (Aerodynamic features in patients with vocal polyps before & after laryngomicrosurgery)

  • 강영애;장재원;구본석
    • 말소리와 음성과학 / Vol. 8, No. 3 / pp.39-49 / 2016
  • The present study examined the change in aerodynamic features after laryngomicrosurgery in patients with vocal polyps. Aerodynamic evaluation was performed in thirty-nine patients (15 males and 24 females) one week before surgery and four weeks after surgery. Evaluation protocols for vital capacity, maximum sustained phonation (MXPH), and voicing efficiency (VOFT) were used to collect 29 phonatory aerodynamic measures, requiring voice with a comfortable pitch and loudness. Statistically significant changes were found for phonation time and airflow values in the MXPH protocol, and for airflow values, subglottal pressure values, and acoustic resistance values in the VOFT protocol. Although phonation time increased in both male and female patients, gender-dependent changes were found in airflow measurements. Men's phonation time increased with no difference in airflow rate, but women's phonation time increased with a decreased airflow rate and lower subglottal pressure. The changes in aerodynamic features may be affected by women's self-perceived change in vocal attitude, namely a reduced sense of vocal effort after surgery.

Pronunciation Training Steps for Natural Pronunciation in In-service Training Program

  • Lim, Un
    • 대한음성학회:학술대회논문집 / 대한음성학회 July 2000 Conference Proceedings / pp.255-270 / 2000
  • Because accuracy is essential to achieving fluency in speaking, both are very important in English education and in-service training programs. To achieve accuracy and fluency, the causes and phenomena of unnatural pronunciation have to be surveyed first. Therefore, this article surveyed the problematic and unnatural pronunciation of Korean English teachers in elementary and secondary schools using CSL and Multi-Speech, and tried to pinpoint the causes of the unnatural pronunciation. Next, a procedure of steps was offered to help them speak naturally through in-service training programs. The analysis found that elementary teachers have unnatural pronunciation below, within, and beyond the word level, and secondary teachers have unnatural pronunciation within and beyond the word level. Therefore, for elementary teachers, pronunciation training courses have to emphasize segmental features first and then move to suprasegmental features. For secondary teachers, pronunciation training courses have to focus on the word level and then move to suprasegmental features, in other words beyond the word level. These pronunciation training courses have to be run in an integrated manner.
