Search | Korea Science

Detecting Data which Represent Emotion Features from the Speech Signal

Park, Chang-Hyun;Sim, Kwee-Bo;Lee, Dong-Wook;Joo, Young-Hoon
- Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / 2001.10a / pp.138.1-138 / 2001
Usually, when we have a conversation with another person, we can perceive their emotions as well as their ideas. Recently, some applications using speech recognition have appeared; however, they can recognize only the content of the information the speaker gives. In the future, machines that are familiar to humans will be required for a more convenient life. Therefore, we need to obtain emotion features. In this paper, we collect a variety of reference data that represent emotion features from the speech signal. Because our final target is to recognize emotion from a stream of speech, we must be able to understand the features that represent emotion. Humans can show many emotions, and the delicate differences between them make this recognition problem difficult.

Real-time Background Music System for Immersive Dialogue in Metaverse based on Dialogue Emotion (메타버스 대화의 몰입감 증진을 위한 대화 감정 기반 실시간 배경음악 시스템 구현)

Kirak Kim;Sangah Lee;Nahyeon Kim;Moonryul Jung
- Journal of the Korea Computer Graphics Society / v.29 no.4 / pp.1-6 / 2023
To enhance immersion in metaverse environments, background music is often used. However, the background music is mostly pre-matched and repeated, which can distract users because it does not align well with rapidly changing, user-interactive content. Thus, we implemented a system that provides a more immersive metaverse conversation experience by 1) developing a regression neural network that extracts emotions from an utterance using KEMDy20, a Korean multimodal emotion dataset; 2) selecting music corresponding to the extracted emotions using the DEAM dataset, in which music is tagged with arousal-valence levels; and 3) combining these with a virtual space where users can have real-time conversations with avatars.
https://doi.org/10.15701/kcgs.2023.29.4.1
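
Step 2 of the abstract above (choosing music whose arousal-valence tag matches the emotion extracted from an utterance) can be illustrated with a minimal sketch. The abstract does not say how the match is made, so nearest-neighbour search by Euclidean distance is an assumption here, and `Track`, `select_track`, `CATALOGUE`, and all numeric values are hypothetical names and illustrative data rather than anything taken from the paper or from DEAM.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """A music track tagged with arousal and valence (hypothetical schema)."""
    title: str
    arousal: float
    valence: float


def select_track(arousal: float, valence: float, tracks: list[Track]) -> Track:
    """Return the track whose (arousal, valence) tag is closest to the query.

    Assumption: closeness is plain Euclidean distance in the
    arousal-valence plane; the paper may use a different rule.
    """
    if not tracks:
        raise ValueError("tracks must not be empty")
    return min(
        tracks,
        key=lambda t: math.hypot(t.arousal - arousal, t.valence - valence),
    )


# Illustrative catalogue; these values are made up, not DEAM annotations.
CATALOGUE = [
    Track("calm_piano", arousal=0.2, valence=0.7),
    Track("tense_strings", arousal=0.8, valence=0.2),
    Track("upbeat_pop", arousal=0.8, valence=0.8),
]

# A high-arousal, low-valence utterance is matched to the tense track.
chosen = select_track(arousal=0.75, valence=0.25, tracks=CATALOGUE)
print(chosen.title)
```

In a real-time setting the query point would come from the regression network's output for the latest utterance, and some smoothing over time would probably be needed so the music does not switch on every sentence.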

Multi-Emotion Regression Model for Recognizing Inherent Emotions in Speech Data (음성 데이터의 내재된 감정인식을 위한 다중 감정 회귀 모델)

Moung Ho Yi;Myung Jin Lim;Ju Hyun Shin
- Smart Media Journal / v.12 no.9 / pp.81-88 / 2023
Recently, online communication has been increasing as non-face-to-face services have spread in response to COVID-19. In non-face-to-face situations, the other person's opinions and emotions are recognized through modalities such as text, speech, and images. Research on multimodal emotion recognition that combines various modalities is currently active. Within this field, emotion recognition using speech data is attracting attention as a means of understanding emotions through sound and language information, but in most cases emotions are recognized using a single speech feature value. However, because a variety of emotions coexist in a complex manner in a conversation, a method for recognizing multiple emotions is needed. Therefore, in this paper, we propose a multi-emotion regression model that preprocesses speech data, extracts feature vectors to recognize complex, inherent emotions, and takes the passage of time into account.
https://doi.org/10.30693/SMJ.2023.12.9.81

Analyzing the element of emotion recognition from speech (음성으로부터 감성인식 요소분석)

Sim, Kwee-Bo;Park, Chang-Hyun
- Journal of the Korean Institute of Intelligent Systems / v.11 no.6 / pp.510-515 / 2001
Generally, the elements for emotion recognition from the speech signal include (1) words used in conversation, (2) tone, (3) pitch, (4) formant frequency, and (5) speech speed. For human beings, it is natural that tone, voice quality, and speech speed are easier cues than frequency for perceiving another person's feelings. Therefore, the former are important elements for classifying feelings. Previous methods have mainly used the former, but formants are well suited to implementation on a machine. Thus, the final goal of this research is to implement an emotion recognition system based on pitch, formant, speech speed, etc. from the speech signal. In this paper, as a first stage, we found specific features of anger in a man's words when he became angry.
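
As an illustration of one of the features listed above, pitch can be estimated from a short speech frame by autocorrelation. This is only a sketch under stated assumptions: the abstract does not describe how the authors measured pitch, and the function name `estimate_pitch_hz`, the 50-400 Hz search range, and the synthetic test tone are all hypothetical choices made for this example.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Assumption: the frame is voiced and long enough to hold at least
    one full period of the lowest frequency searched (fmin).
    """
    lag_min = int(sample_rate / fmax)  # shortest period considered
    lag_max = int(sample_rate / fmin)  # longest period considered
    if lag_max >= len(frame):
        raise ValueError("frame too short for the requested fmin")

    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation: longer lags sum fewer terms,
        # which mildly favours the shortest matching period.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Usage on a synthetic 200 Hz tone sampled at 8 kHz (100 ms frame).
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
print(round(estimate_pitch_hz(tone, 8000)))
```

Real speech would need framing, a voiced/unvoiced decision, and smoothing across frames; formants and speech speed require separate analyses that are not sketched here.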

Applying Social Strategies for Breakdown Situations of Conversational Agents: A Case Study using Forewarning and Apology (대화형 에이전트의 오류 상황에서 사회적 전략 적용: 사전 양해와 사과를 이용한 사례 연구)

Lee, Yoomi;Park, Sunjeong;Suk, Hyeon-Jeong
- Science of Emotion and Sensibility / v.21 no.1 / pp.59-70 / 2018
With breakthroughs in speech recognition technology, conversational agents have become pervasive through smartphones and smart speakers. The accuracy of speech recognition has reached a human level, but it still shows limitations in understanding the underlying meaning or intention of words and in following long conversations. Accordingly, users experience various errors when interacting with conversational agents, which may negatively affect the user experience. In addition, for smart speakers that use voice as the main interface, the lack of system feedback and transparency has been reported as the main issue for users. Therefore, there is a strong need for research on how users can better understand the capabilities of conversational agents and how negative emotions in error situations can be mitigated. In this study, we applied two social strategies, "forewarning" and "apology", to a conversational agent and investigated how these strategies affect users' perceptions of the agent in breakdown situations. For the study, we created a series of demo videos of a user interacting with a conversational agent. After watching the demo videos, the participants were asked to evaluate how much they liked and trusted the agent through an online survey. Responses from a total of 104 participants were analyzed, and the results were contrary to our expectations based on the literature review. Forewarning gave users a negative impression, especially regarding the reliability of the agent. Apology in a breakdown situation did not affect users' perceptions. In the subsequent in-depth interviews, participants explained that they perceived the smart speaker as a machine rather than a human-like object, and for this reason the social strategies did not work. These results show that social strategies should be applied according to the perceptions that users have of agents.
https://doi.org/10.14695/KJSOS.2018.21.1.59
