• Title/Summary/Keyword: 화자 특징

Search Result 299, Processing Time 0.024 seconds

The Study on Body Language in Animation as Functional Aspects -Focusing on Mulan, Beauty and the beast, Aladdin, Sinbad- (기능론적 관점에서 본 애니메이션의 신체언어 연구 - 뮬란, 미녀와 야수, 알라딘, 신밧드를 중심으로-)

  • Chung, Mi-Ghang;Lee, Mi-Young;Kim, Sung-Hee;Kim, Jae-Ho
    • Archives of design research
    • /
    • v.20 no.1 s.69
    • /
    • pp.55-64
    • /
    • 2007
  • Non-verbal communications are important because they support and replace verbal communication. Body language of various non-verbal communications is the communication using the body. In animation, expression of body language is very important because characters play an important role in communicating the scenario. Animation has a dual communication structure, different from general communication. One is the communication between the speaker character and the hearer character, the other is the image and the audience, which includes the communication between the speaker character and the hearer character. In this study, we divide the body language from the characters into the discourse-in act and discourse-out act according to this dual structure and classify it into adaptors, emblem, illustrator, regulator, affect display by a functional approach method. Especially, the illustrator is subdivided into pragmatic speech act. Finally, this study analyzes the features of body language in animation and represents animation character's body language for an effective expression of the communications in animation.

  • PDF

Voice Command Web Browser Using Variable Vocabulary Word Recognizer (가변어휘 단어 인식기를 사용한 음성 명령 웹 브라우저)

  • 이항섭
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.2
    • /
    • pp.48-52
    • /
    • 1999
  • In this paper, we describe a Voice Command Web Browser using a variable vocabulary word recognizer that can do Internet surfing with Korean speech recognition on the Web. The feature of this browser is that it can handle the links and menus of the web browser by speech. Therefore, we can use speech interface together with mouse for web browsing. To recognize the recognition candidates dynamically changing according to Web pages, we use the variable vocabulary word recognizer. The recognizer was trained using POW (Phonetically Optimized Words) 3,848 words. So that it can recognize new words which did not exist in training data. The preliminary test results showed that the performance of speaker-independent and vocabulary-independent recognition is 93.8% for 32 Korean words. The Voice Command Web Browser was developed on windows 95/NT using Netscape Navigator and reflected usability test results in order to offer easy interface to users unfamiliar with speech interface. In on-line experiment of speaker-independent and environment-independent situation, Voice Command Web Browser showed recognition accuracy of 90%.

  • PDF

On the Development of a Continuous Speech Recognition System Using Continuous Hidden Markov Model for Korean Language (연속분포 HMM을 이용한 한국어 연속 음성 인식 시스템 개발)

  • Kim, Do-Yeong;Park, Yong-Kyu;Kwon, Oh-Wook;Un, Chong-Kwan;Park, Seong-Hyun
    • The Journal of the Acoustical Society of Korea
    • /
    • v.13 no.1
    • /
    • pp.24-31
    • /
    • 1994
  • In this paper, we report on the development of a speaker independent continuous speech recognition system using continuous hidden Markov models. The continuous hidden Markov model consists of mean and covariance matrices and directly models speech signal parameters, therefore does not have quantization error. Filter bank coefficients with their 1st and 2nd-order derivatives are used as feature vectors to represent the dynamic features of speech signal. We use the segmental K-means algorithm as a training algorithm and triphone as a recognition unit to alleviate performance degradation due to coarticulation problems critical in continuous speech recognition. Also, we use the one-pass search algorithm that Is advantageous in speeding-up the recognition time. Experimental results show that the system attains the recognition accuracy of $83\%$ without grammar and $94\%$ with finite state networks in speaker-indepdent speech recognition.

  • PDF

Implementation of a Speech Recognition System for a Car Navigation System (차량 항법용 음성인식 시스템의 구현)

  • Lee, Tae-Han;Yang, Tae-Young;Park, Sang-Taick;Lee, Chung-Yong;Youn, Dae-Hee;Cha, Il-Hwan
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.36S no.9
    • /
    • pp.103-112
    • /
    • 1999
  • In this paper, a speaker-independent isolated world recognition system for a car navigation system is implemented using a general digital signal processor. This paper presents a method combining SNR normalization with RAS as a noise processing method. The semi-continuous hidden markov model is adopted and TMS320C31 is used in implementing the real-time system. Recognition word set is composed of 69 command words for a car navigation system. Experimental results showed that the recognition performance has a maximum of 93.62% in case of a combination of SNR normalization and spectral subtraction, and the performance improvement rate of the system is 3.69%, Presented noise processing method showed good speech recognition performance in 5dB SNR in car environment.

  • PDF

Wide Coverage Microphone System for Lecture Using Ceiling-Mounted Array Structure (천정형 배열 마이크를 이용한 강의용 광역 마이크 시스템)

  • Oh, Woojin
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.4
    • /
    • pp.624-633
    • /
    • 2018
  • While the multimedia lecture system has been getting smart using immerging technology, the microphone still relies on the classical approach such as holding in hand or attaching on the body. In this paper, we propose a ceiling mounted array microphone system that allows a wide reception coverage and instructors to move freely without attaching microphone. The proposed system adopts cell and handover of mobile communication instead of a complicated beamforming method and implements a wide range microphone over several cells with low cost. Since the characteristics of unvoiced speech is similar to Pseudo Noise it is shown that soft handover are possible with 3 microphones connected to delay-sum multipath receiver. The proposed system is tested in $6.3{\times}1.5m$ area. For real-time processing the correlation range can be reduced by 82% or more, and the output latency delay can be improved by using the delay adaptive filter.

Popular Song's Lyrics as Tourism Language: Focusing on Exoticism (관광언어로서 대중가요 노랫말 : 이국성을 중심으로)

  • Yang, Soung-hoon;Choi, Mun-yong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.6
    • /
    • pp.3778-3786
    • /
    • 2015
  • This paper aimed to explore the possible popular song's linkage to tourism promotion material. Especially focusing on the lyrics of Korean popular song, which explicitly containing exoticism in title or lyrics in early and mid of 20 centuries, this research verified how authors represented the one of Dann's language of tourism thesis: strangeness perspective in tourism and language of differentiation. Result suggested that song writers, as of their imaginary by-product or personal experience, present exoticism in lyrics of song in various ways: exaggerating, creating, stereo-typing exotic place. Fulfilling the Dann's idea, songs elaborately delivered familiar words, conspicuously by personifying place, allocating young ladies and mentioning origin or hometown. The research was efforts to find the origin of entertainment contents savvy-tourism promotion and also traced potential tourist's collective imagination toward potential overseas tourism destination.

Comparison of prosodic characteristics by question type in left- and right-hemisphere-injured stroke patients (좌반구 손상과 우반구 손상 뇌졸중 환자의 의문문 유형에 따른 운율 특성 비교)

  • Yu, Youngmi;Seong, Cheoljae
    • Phonetics and Speech Sciences
    • /
    • v.13 no.3
    • /
    • pp.1-13
    • /
    • 2021
  • This study examined the characteristics of linguistic prosody in terms of cerebral lateralization in three groups of 9 healthy speakers and 14 speakers with a history of stroke (7 with left hemisphere damage (LHD), 7 with right hemisphere damage (RHD)). Specifically, prosodic characteristics related to speech rate, duration, pitch, and intensity were examined in three types of interrogative sentences (wh-questions, yes-no questions, alternative questions) with auditory perceptual evaluation. As a result, the statistically significant key variables showed flaws in production of the linguistic prosody in the speakers with LHD. The statistically significant variables were more insufficiently produced for wh-questions than for yes-no and alternative questions. This trend was particularly noticeable in variables related to pitch and speech rate. This result suggests that when Korean speakers process linguistic prosody, such as that of lexico-semantic and syntactic information in interrogative sentences, the left hemisphere seems to be superior to the right hemisphere.

Characteristics of Vowel Formants, Voice Intensity, and Fundamental Frequency of Female with Amyotrophic Lateral Sclerosis using Spectrograms (스펙트로그램을 이용한 근위축성측삭경화증 여성 화자의 모음 포먼트, 음성강도, 기본주파수의 변화)

  • Byeon, Haewon
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.9
    • /
    • pp.193-198
    • /
    • 2019
  • This study analyzed the changes of vowel formant, voice intensity, and fundamental frequency of vowels for 11 months using acoustochemical spectrogram analysis of women diagnosed with amyotrophic lateral sclerosis (ALS). The test word was a vowel /a, i, u/ and a diphthong /h + ja + da/, /h + wi + da/, and /h +ɰi+ da/. Speech data were collected through the word reading task presented on the monitor using 'Alvin' program, and the recording environment was set to 5,500 Hz for the nyquist frequency and 11,000 Hz for the sampling rate. The records were analyzed by using spectrograms to vowel formants, voice intensity, and fundamental frequency. As a result of analysis, the fundamental frequency and intensity of the ALS process were decreased and the formant slope of the diphthong was decreased rather than the formant change in the vowel. This result suggests that the vowel distortion of ALS due to disease progression is due to the decrease of tongue and jaw co morbidity.

Pronunciation of the Korean diphthong /jo/: Phonetic realizations and acoustic properties (한국어 /ㅛ/의 발음 양상 연구: 발음형 빈도와 음향적 특징을 중심으로)

  • Hyangwon Lee
    • Phonetics and Speech Sciences
    • /
    • v.15 no.1
    • /
    • pp.9-17
    • /
    • 2023
  • The purpose of this study is to determine how the Korean diphthong /jo/ shows phonetic variation in various linguistic environments. The pronunciation of /jo/ is discussed, focusing on the relationship between phonetic variation and the distribution range of vowels. The location in a word (monosyllable, word-initial, word-medial, word-final) and word class (content word, function word) were analyzed using the speech of 10 female speakers of the Seoul Corpus. As a result of determining the frequency of appearance of /jo/ in each environment, the pronunciation type and word class were affected by the location in a word. Frequent phonetic reduction was observed in the function word /jo/ in the acoustic analysis. The word class did not change the average phonetic values of /jo/, but changed the distribution of individual tokens. These results indicate that the linguistic environment affects the phonetic distribution of vowels.

A Study on Robust Speech Emotion Feature Extraction Under the Mobile Communication Environment (이동통신 환경에서 강인한 음성 감성특징 추출에 대한 연구)

  • Cho Youn-Ho;Park Kyu-Sik
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.6
    • /
    • pp.269-276
    • /
    • 2006
  • In this paper, we propose an emotion recognition system that can discriminate human emotional state into neutral or anger from the speech captured by a cellular-phone in real time. In general. the speech through the mobile network contains environment noise and network noise, thus it can causes serious System performance degradation due to the distortion in emotional features of the query speech. In order to minimize the effect of these noise and so improve the system performance, we adopt a simple MA (Moving Average) filter which has relatively simple structure and low computational complexity, to alleviate the distortion in the emotional feature vector. Then a SFS (Sequential Forward Selection) feature optimization method is implemented to further improve and stabilize the system performance. Two pattern recognition method such as k-NN and SVM is compared for emotional state classification. The experimental results indicate that the proposed method provides very stable and successful emotional classification performance such as 86.5%. so that it will be very useful in application areas such as customer call-center.