• Title/Summary/Keyword: Affective Speech

Emotional Speaker Recognition using Emotional Adaptation (감정 적응을 이용한 감정 화자 인식)

  • Kim, Weon-Goo
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.7 / pp.1105-1110 / 2017
  • Speech colored by emotion degrades the performance of speaker recognition systems. In this paper, a speaker recognition method using emotional adaptation is proposed to improve the performance of speaker recognition on affective speech. For emotional adaptation, an emotional speaker model is generated from the emotion-free speaker model using a small number of affective training utterances and a speaker adaptation method. Since it is not easy to obtain sufficient affective speech from a speaker for training, using only a few affective utterances is very practical in real situations. The proposed method was evaluated using a Korean database containing four emotions. Experimental results show that the proposed method outperforms conventional methods in both speaker verification and speaker identification.
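
The abstract does not specify the adaptation scheme, but mean-only MAP adaptation of a GMM speaker model is one standard way to derive an adapted model from a handful of utterances. A minimal sketch under that assumption (all names and the relevance factor are illustrative, not the paper's published method):

```python
# Sketch: mean-only MAP adaptation of a GMM speaker model, one common
# way to derive an "emotional" model from a neutral one using only a
# few affective utterances. Names and the relevance factor are
# illustrative assumptions, not the paper's published scheme.
import numpy as np
from scipy.stats import multivariate_normal

def map_adapt_means(weights, means, covs, X, relevance=16.0):
    """Shift GMM means toward adaptation frames X (n_frames x n_dims)."""
    # Posterior probability of each mixture component for each frame
    lik = np.stack([w * multivariate_normal.pdf(X, m, c)
                    for w, m, c in zip(weights, means, covs)], axis=1)
    post = lik / (lik.sum(axis=1, keepdims=True) + 1e-12)
    n_k = post.sum(axis=0)                         # soft frame counts
    x_bar = (post.T @ X) / (n_k[:, None] + 1e-12)  # data mean per component
    alpha = (n_k / (n_k + relevance))[:, None]     # data-vs-prior weight
    return alpha * x_bar + (1.0 - alpha) * means   # adapted means
```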

A comparison between affective prosodic characteristics observed in children with cochlear implant and normal hearing (인공와우 이식 아동과 정상 청력 아동의 정서적 운율 특성 비교)

  • Oh, Yeong Geon;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.8 no.3 / pp.67-78 / 2016
  • This study examined the affective prosodic characteristics of children with cochlear implants (CI, hereafter) and with normal hearing (NH, hereafter), along with listeners' perception of them. Speech samples were acquired from 15 NH and 15 CI children, and eight speech-language pathologists (SLPs) perceptually evaluated the affective types using Praat's ExperimentMFC. Acoustically, there were statistically significant differences between the two groups across affective types: joy (discriminated by intensity deviation), anger (dominantly by intensity-related variables and partly by duration-related variables), and sadness (by all aspects of the prosodic variables). Compared with the NH children, the CI children were much louder when expressing joy; louder and slower when expressing anger; and higher, louder, and slower when expressing sadness. Listeners showed much higher agreement when evaluating the NH children than the CI group (p<.001). Chi-square results revealed that listeners were consistent on the NH children's utterances but not on the CI children's (CI: p<.01, NH: p=.48). When the CI utterances were discriminated into three emotional types by discriminant analysis (DA) using eight acoustic variables, speed-related variables such as articulation rate played the primary role.
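
As a rough illustration of the discriminant analysis (DA) step mentioned above, the sketch below classifies utterances into the three emotion types from a matrix of acoustic variables; the data and shapes are placeholders, not the study's measurements:

```python
# Sketch of the kind of discriminant analysis described above:
# classifying utterances into three emotion types from acoustic
# variables. The data here are random placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.random((45, 8))                         # 45 utterances x 8 acoustic variables
y = np.repeat(["joy", "anger", "sadness"], 15)  # intended emotion labels

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.score(X, y))      # within-sample classification rate
print(np.abs(lda.coef_))    # rough view of which variables separate classes
```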

Development of cognitive-behavioral group counseling program for elementary children with speech anxiety and its effects (초등학생의 발표 불안 감소를 위한 인지적 행동주의 집단상담의 효과)

  • Gim, Byong-Yun
    • The Korean Journal of Elementary Counseling / v.4 no.1 / pp.167-190 / 2005
  • The purpose of this study was to develop a cognitive-behavioral group counseling program to relieve elementary school children of speech anxiety and to examine its effects. The program was built around cognitive, affective, and behavioral activities. The cognitive activities were based on REBT procedures; the affective activities included making a nickname for each child, finding one's own strengths, and exchanging positive feedback with one another; and the behavioral activities included assertiveness training and coaching and practicing speech behavior. Subjects were 14 children from M elementary school in Gwangju who had the highest scores on a speech anxiety test administered to all of the school's sixth graders. Seven subjects were randomly allocated to the experimental group and seven to the control group. Two speech anxiety tests and one speech behavior checklist were administered as pre- and post-tests, and the collected data were analyzed by ANCOVA. The results demonstrated that, compared with the control group, the experimental group showed statistically significant changes in the scores of the speech anxiety test and the speech behavior checklist. It was therefore concluded that the program developed in this study is effective in reducing elementary school children's speech anxiety.

Interaction between emotional content of word and prosody in the evaluation of emotional valence (정서의미 전달에 있어서 운율과 단어 정보의 상호작용.)

  • Choi, Moon-Gee;Nam, Ki-Chun
    • Proceedings of the KSPS conference / 2007.05a / pp.67-70 / 2007
  • The present paper focuses on the interaction between lexical-semantic information and affective prosody. Previous studies showed that the influence of lexical-semantic information on the affective evaluation of prosody is relatively clear, but the influence of emotional prosody on word evaluation remains ambiguous. Here we explore whether affective prosody influences the evaluation of the affective meaning of an utterance and vice versa, using more ecological stimuli (sentences) rather than single words. We asked participants to evaluate the emotional valence of sentences recorded with affective prosody (negative, neutral, and positive) in Experiment 1, and the emotional valence of the prosodies themselves in Experiment 2. The results showed that the emotional valence of prosody can influence the emotional evaluation of sentences and vice versa. Interestingly, positive prosody appears to be chiefly responsible for this interaction.

What you said vs. how you said it. ('어떻게 말하느냐?' vs. '무엇을 말하느냐?')

  • Choi, Moon-Gee;Nam, Ki-Chun
    • Proceedings of the KSPS conference / 2006.11a / pp.11-13 / 2006
  • The present paper focuses on the interaction between lexical-semantic information and affective prosody. More specifically, we explore whether affective prosody influences the evaluation of the affective meaning of a word. To this end, we asked participants to listen to words recorded with affective prosody and to evaluate their emotional content. The results showed, first, that emotional evaluation was slower when the word's meaning was negative than when it was positive; second, that evaluation was faster when the prosody of a word was negative than when it was neutral or positive; and finally, that when the affective meaning of the word and its prosody were congruent, response times were faster than when they were incongruent.

Speech Emotion Recognition on a Simulated Intelligent Robot (모의 지능로봇에서의 음성 감정인식)

  • Jang, Kwang-Dong;Kim, Nam;Kwon, Oh-Wook
    • MALSORI / no.56 / pp.173-183 / 2005
  • We propose a speech emotion recognition method for an affective human-robot interface. In the proposed method, emotion is classified into six classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian-kernel support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogues uttered 2 m away from the microphones in five different directions. Experimental results show that the proposed method yields 48% classification accuracy, while human classifiers achieve 71% accuracy.
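
The classifier stage is easy to sketch: an RBF ("Gaussian") kernel support vector machine over per-utterance feature statistics. Only the six emotion classes and the kernel choice come from the abstract; the feature layout and data below are placeholders:

```python
# Sketch of the classification stage: an RBF ("Gaussian") SVM over
# per-utterance statistics of phonetic and prosodic features.
# Feature extraction is reduced to random placeholders.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

EMOTIONS = ["angry", "bored", "happy", "neutral", "sad", "surprised"]
rng = np.random.default_rng(0)

# Columns would be statistics (mean/std/range, ...) of log energy,
# shimmer, formants, Teager energy, pitch, jitter, duration, rate.
X_train = rng.random((120, 24))             # placeholder features
y_train = rng.choice(EMOTIONS, size=120)    # placeholder labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(clf.predict(rng.random((1, 24))))     # emotion for a new utterance
```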

Development and validation of a Korean Affective Voice Database (한국형 감정 음성 데이터베이스 구축을 위한 타당도 연구)

  • Kim, Yeji;Song, Hyesun;Jeon, Yesol;Oh, Yoorim;Lee, Youngmee
    • Phonetics and Speech Sciences / v.14 no.3 / pp.77-86 / 2022
  • In this study, we report the validation results of the Korean Affective Voice Database (KAV DB), an affective voice database available for scientific and clinical use comprising 113 validated affective voice stimuli. The KAV DB includes audio recordings of two actors (one male and one female), each uttering 10 semantically neutral sentences with the intention of conveying six different affective states (happiness, anger, fear, sadness, surprise, and neutral). To validate the KAV DB, the database was organized into three separate voice stimulus sets, and participants rated the stimuli on six rating scales corresponding to the six targeted affective states using a 100-point horizontal visual analog scale. The KAV DB showed high internal consistency across voice stimuli (Cronbach's α=.847), high sensitivity (mean=82.8%), and high specificity (mean=83.8%). The KAV DB is expected to be useful for both academic research and clinical purposes in the field of communication disorders, and is available for download at https://kav-db.notion.site/KAV-DB-7539a36abe2e414ebf4a50d80436b41a.
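
The reported validation metrics are standard and straightforward to recompute from rating data. A minimal sketch of per-emotion sensitivity/specificity and Cronbach's alpha, with data shapes assumed for illustration:

```python
# Sketch of the validation metrics reported above: per-emotion
# sensitivity/specificity from listeners' choices, and Cronbach's
# alpha across stimuli. Data layouts are illustrative assumptions.
import numpy as np

def sensitivity_specificity(true_labels, chosen_labels, emotion):
    t = np.asarray(true_labels) == emotion
    c = np.asarray(chosen_labels) == emotion
    sens = (t & c).sum() / t.sum()        # hit rate for this emotion
    spec = (~t & ~c).sum() / (~t).sum()   # correct-rejection rate
    return sens, spec

def cronbach_alpha(ratings):
    """ratings: (raters x items) matrix of scale scores."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                           # number of items
    item_vars = ratings.var(axis=0, ddof=1).sum()
    total_var = ratings.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Example: intended vs. chosen emotion over four hypothetical trials
print(sensitivity_specificity(
    ["happy", "anger", "happy", "fear"],
    ["happy", "anger", "fear", "fear"], "happy"))   # (0.5, 1.0)
```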

Speech Emotion Recognition Based on Deep Networks: A Review (딥네트워크 기반 음성 감정인식 기술 동향)

  • Mustaqeem, Mustaqeem;Kwon, Soonil
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.331-334 / 2021
  • In recent years, a significant amount of research and development has been devoted to the use of deep learning (DL) for speech emotion recognition (SER), particularly methods based on convolutional neural networks (CNNs). These techniques typically apply CNNs to applications associated with emotion recognition, and numerous DL-based mechanisms have been considered, which matter for SER-based human-computer interaction (HCI) applications. Compared with other methods, DL-based approaches are producing quite promising results in many fields, including automatic speech recognition, and hence attract a great deal of study and investigation. In this article, we review and evaluate the improvements that have occurred in the SER domain, while also discussing the existing studies on DL- and CNN-based SER.
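
As a concrete, purely illustrative instance of the CNN-based SER models such reviews survey, the sketch below defines a small convolutional network over log-mel spectrograms in PyTorch; the layer sizes are assumptions, not an architecture from any surveyed paper:

```python
# Sketch: a minimal CNN for speech emotion recognition over log-mel
# spectrograms. Sizes are illustrative, not from any surveyed paper.
import torch
import torch.nn as nn

class SERConvNet(nn.Module):
    def __init__(self, n_emotions=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_emotions)

    def forward(self, x):                # x: (batch, 1, mel_bins, frames)
        return self.classifier(self.features(x).flatten(1))

logits = SERConvNet()(torch.randn(2, 1, 64, 128))  # two dummy log-mel inputs
print(logits.shape)                                # torch.Size([2, 6])
```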

Design and implement of the Educational Humanoid Robot D2 for Emotional Interaction System (감성 상호작용을 갖는 교육용 휴머노이드 로봇 D2 개발)

  • Kim, Do-Woo;Chung, Ki-Chull;Park, Won-Sung
    • Proceedings of the KIEE Conference / 2007.07a / pp.1777-1778 / 2007
  • In this paper, we design and implement a humanoid robot for educational purposes that can collaborate and communicate with humans. We present an affective human-robot communication system for the humanoid robot D2, which we designed to communicate with a human through dialogue. D2 communicates with humans by understanding and expressing emotion through facial expressions, voice, gestures, and posture. Interaction between a human and the robot is made possible through our affective communication framework, which enables the robot to detect the emotional status of the user and to respond appropriately; as a result, the robot can engage in natural dialogue with a human. To support interaction through voice, gestures, and posture, the robot consists of an upper body, two arms, a wheeled mobile platform, and control hardware, including vision and speech capabilities and various control boards such as motion control boards and a signal processing board handling several types of sensors. Using the educational humanoid robot D2, we have presented successful demonstrations consisting of manipulation tasks with the two arms, object tracking using the vision system, and communication with humans through the emotional interface, synthesized speech, and recognition of speech commands.

Voice and Image: A Pilot Study (음성과 인상의 관계규명을 위한 실험적 연구)

  • Moon, Seung-Jae
    • MALSORI / no.35_36 / pp.37-48 / 1998
  • When we hear someone's voice, even without having met the person, we usually form a certain mental image of that person. This study investigates the relationship between the voice and the image information carried within it. Does the mental picture created by the voice closely reflect the real image, and if not, is it related to the real image at all? To answer the first question, a perception experiment was carried out. Speech samples of a short read sentence were recorded from 8 males and 8 females, and pictures of the subjects were also taken. Ajou University students were asked to match each voice with the corresponding picture. Participants correctly matched 1 female voice and 4 male voices with their pictures. Interestingly, even in cases of mismatch, the results show a very strong tendency: although participants falsely matched a certain voice with a certain picture, the majority of them chose the same picture for that voice, and this was the case for all mismatches. It seems that a voice does give the listener a certain impression of physical characteristics, even if that impression is not always correct. By showing that there is a clear relationship between voice and image, this study provides a starting point for further research on voice characteristics: what characteristics of the voice carry the relevant information? Such research will contribute to the understanding of the affective domain of the human voice and to speech technology.
