Search | Korea Science

Speech Emotion Recognition on a Simulated Intelligent Robot (모의 지능로봇에서의 음성 감정인식)

Jang Kwang-Dong;Kim Nam;Kwon Oh-Wook
- MALSORI / no.56 / pp.173-183 / 2005
We propose a speech emotion recognition method for an affective human-robot interface. In the proposed method, emotion is classified into 6 classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We record speech commands and dialogs uttered 2 m away from microphones in 5 different directions. Experimental results show that the proposed method yields 48% classification accuracy while human classifiers give 71% accuracy.
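The abstract lists Teager energy among the phonetic features and says that utterance-level features come from statistics of such measurements. As a rough illustration only, the sketch below uses the standard discrete Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and summarizes it with mean, standard deviation, and maximum. The function names, the choice of statistics, and the test tone are assumptions made for this example, not details taken from the paper.

```python
import math

def teager_energy(signal):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [signal[n] ** 2 - signal[n - 1] * signal[n + 1]
            for n in range(1, len(signal) - 1)]

def summarize(values):
    """Utterance-level statistics (assumed here: mean, std, max)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "std": math.sqrt(var), "max": max(values)}

# Hypothetical test tone: amplitude 0.5, digital frequency 0.3 rad/sample.
amplitude, omega = 0.5, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(200)]
stats = summarize(teager_energy(tone))
```

For a pure sinusoid the operator returns the constant A^2 * sin^2(omega), so the mean tracks both amplitude and frequency, which is why it is often used as an energy-like cue for speech.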

Context sentiment analysis based on Speech Tone (발화 음성을 기반으로 한 감정분석 시스템)

Jung, Jun-Hyeok;Park, Soo-Duck;Kim, Min-Seung;Park, So-Hyun;Han, Sang-Gon;Cho, Woo-Hyun
- Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.1037-1040 / 2017
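With machine learning and deep learning advancing rapidly, numerous AI voice assistants have been released. However, they analyze only the words in the speaker's sentence and return a result; because they cannot recognize nonverbal elements, their results have a structural limitation. In this study, we therefore convert nonverbal elements of human communication, such as speech rate and changes in tone, into numerical data and perform supervised learning based on Plutchik's wheel of emotions. Subsequently input speech data is then analyzed with the kNN algorithm against the previously learned data.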
https://doi.org/10.3745/PKIPS.y2017m11a.1037
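
The kNN step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: the two features (speech rate and pitch range), their units, the training points, and the value of k are all hypothetical, and the labels are three of the primary emotions on Plutchik's wheel.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features: (syllables per second, pitch range in semitones).
train = [
    ((6.5, 9.0), "anger"), ((6.8, 8.5), "anger"), ((6.2, 9.5), "anger"),
    ((3.0, 2.0), "sadness"), ((3.2, 2.5), "sadness"), ((2.8, 1.8), "sadness"),
    ((5.0, 7.0), "joy"), ((5.2, 6.5), "joy"), ((4.8, 7.2), "joy"),
]
prediction = knn_predict(train, (3.1, 2.2))
```

In practice the features would usually be normalized first, since Euclidean distance otherwise lets the feature with the largest numeric range dominate the vote.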

Speech Emotion Recognition by Speech Signals on a Simulated Intelligent Robot (모의 지능로봇에서 음성신호에 의한 감정인식)

Jang, Kwang-Dong;Kwon, Oh-Wook
- Proceedings of the KSPS conference / 2005.11a / pp.163-166 / 2005
We propose a speech emotion recognition method for a natural human-robot interface. In the proposed method, emotion is classified into 6 classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We record speech commands and dialogs uttered 2 m away from microphones in 5 different directions. Experimental results show that the proposed method yields 59% classification accuracy while human classifiers give about 50% accuracy, which confirms that the proposed method achieves performance comparable to a human.
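Jitter and shimmer, both named in the abstract, measure cycle-to-cycle perturbation of the pitch period and of the amplitude, respectively. The sketch below uses one common definition (the mean absolute difference between consecutive cycles divided by the mean value); the abstract does not say which variant the authors used, and the period and amplitude values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

# Hypothetical per-cycle measurements from a voiced segment.
periods_ms = [5.0, 5.2, 4.9, 5.1, 5.0]       # pitch periods in milliseconds
peak_amps = [0.80, 0.78, 0.83, 0.79, 0.81]   # peak amplitude of each cycle

jitter = relative_perturbation(periods_ms)   # period perturbation
shimmer = relative_perturbation(peak_amps)   # amplitude perturbation
```

Both values are dimensionless ratios, so they can be compared across speakers with different average pitch or loudness.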

An acoustic study of feeling between standard language and dialect (표준어와 방언간의 감정변화에 대한 음성적 연구)

Lee, Yeon-Soo;Park, Young-Beom
- Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.63-66 / 2009
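Human emotional change can be broadly described in terms of four states: joy, sadness, excitement, and neutral. Among these four states, joy and sadness, and excitement and joy, have acoustically similar forms. For excitement and joy, the neutral state of a dialect has characteristics similar to the joy and excitement states of the standard language. Because of these characteristics, when the excited state is being recognized, the neutral state of a dialect is sometimes misrecognized as excitement. This paper aims to identify the acoustic differences that cause this problem. For the comparison, the excited state in the standard language and in dialect was studied using three factors: pitch, formants, and formant RMS error.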
https://doi.org/10.3745/PKIPS.y2009m04a.63

Development of Driver's Emotion and Attention Recognition System using Multi-modal Sensor Fusion Algorithm (다중 센서 융합 알고리즘을 이용한 운전자의 감정 및 주의력 인식 기술 개발)

Han, Cheol-Hun;Sim, Kwee-Bo
- Journal of the Korean Institute of Intelligent Systems / v.18 no.6 / pp.754-761 / 2008
As the automobile industry and its technologies develop, drivers tend to care more about service than about mechanical matters. For this reason, interest in recognizing human knowledge and emotion to create a safe and convenient driving environment keeps growing. Recognition of human knowledge and emotion is an emotion engineering technology that has been studied since the late 1980s to provide people with human-friendly services. Emotion engineering technology analyzes people's emotions through their faces, voices, and gestures, so applying it to automobiles lets us supply drivers with services suited to each driver's situation and help them drive safely. Furthermore, recognizing the driver's gestures can prevent accidents caused by careless driving or dozing off at the wheel. The purpose of this paper is to develop a system that can recognize the driver's emotional state and attention for safe driving. First, we detect signals of the driver's emotion using bio-motion signals, sleepiness, and attention, and then build several types of databases. By analyzing these databases, we find distinctive features of drivers' emotion, sleepiness, and attention, and fuse the results with a multi-modal method to develop the system.
https://doi.org/10.5391/JKIIS.2008.18.6.754

Speaker and Context Independent Emotion Recognition using Speech Signal (음성을 이용한 화자 및 문장독립 감정인식)

강면구;김원구
- Proceedings of the IEEK Conference / 2002.06d / pp.377-380 / 2002
In this paper, speaker- and context-independent emotion recognition using speech signals is studied. For this purpose, a corpus of emotional speech, recorded and classified by emotion through subjective evaluation, was used to build statistical feature vectors, such as the average, standard deviation, and maximum value of pitch and energy, and to evaluate the performance of conventional pattern matching algorithms. A vector quantization based emotion recognition system is proposed for speaker- and context-independent emotion recognition. Experimental results showed that the vector quantization based emotion recognizer using MFCC parameters performed better than the one using pitch and energy parameters.
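A vector quantization recognizer of the kind described here typically keeps one codebook per emotion and assigns an utterance to the emotion whose codebook quantizes its frames with the least average distortion. The sketch below shows only that decision rule, under stated assumptions: the codebooks are hypothetical 2-D stand-ins for codebooks trained on MFCC vectors, and codebook training (for example with k-means or LBG) is left out.

```python
import math

def mean_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(math.dist(f, c) for c in codebook) for f in frames) / len(frames)

def classify(frames, codebooks):
    """Pick the emotion whose codebook yields the smallest mean distortion."""
    return min(codebooks, key=lambda emotion: mean_distortion(frames, codebooks[emotion]))

# Hypothetical 2-D codebooks; a real system would use higher-dimensional MFCC codewords.
codebooks = {
    "neutral": [(0.0, 0.0), (1.0, 0.5)],
    "angry": [(4.0, 4.0), (5.0, 3.5)],
    "sad": [(-3.0, 1.0), (-2.5, 2.0)],
}
utterance = [(4.2, 3.9), (4.8, 3.6), (3.9, 4.1)]  # hypothetical feature frames
label = classify(utterance, codebooks)
```

Because the decision depends only on how frames are distributed in feature space, not on their order, this rule does not rely on the sentence content, which fits the context-independent setting the abstract describes.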

Unraveling Emotions in Speech: Deep Neural Networks for Emotion Recognition (음성을 통한 감정 해석: 감정 인식을 위한 딥 뉴럴 네트워크 예비 연구)

Edward Dwijayanto Cahyadi;Mi-Hwa Song
- Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.411-412 / 2023
Speech emotion recognition (SER) is one of the interesting topics in the machine learning field. Developing SER brings numerous benefits. An SER system can be built using a convolutional neural network and the Long Short-Term Memory (LSTM) method, both part of artificial intelligence.
https://doi.org/10.3745/PKIPS.y2023m05a.411

Acoustic parameters for induced emotion categorizing and dimensional approach (자연스러운 정서 반응의 범주 및 차원 분류에 적합한 음성 파라미터)

Park, Ji-Eun;Park, Jeong-Sik;Sohn, Jin-Hun
- Science of Emotion and Sensibility / v.16 no.1 / pp.117-124 / 2013
This study examined how precisely MFCC, LPC, energy, and pitch-related parameters of speech data, which have been used mainly in voice recognition systems, can predict vocal emotion categories as well as the dimensions of vocal emotion. 110 college students participated in the experiment. To elicit more realistic emotional responses, we used well-defined emotion-inducing stimuli. Because a dimensional approach is more useful for realistic emotion classification, the study analyzed the relationship between the MFCC, LPC, energy, and pitch parameters of the speech data and four emotional dimensions (valence, arousal, intensity, and potency). Stepwise multiple regression analysis identified the best vocal cue parameters for predicting each dimension. Emotion categorization accuracy, analyzed by LDA, was 62.7%, and the four dimensional regression models were statistically significant (p < .001). These results suggest that the parameters could also be applied to spontaneous vocal emotion recognition.

Voice Recognition Chatbot System for an Aging Society: Technology Development and Customized UI/UX Design (고령화 사회를 위한 음성 인식 챗봇 시스템 : 기술 개발과 맞춤형 UI/UX 설계)

Yun-Ji Jeong;Min-Seong Yu;Joo-Young Oh;Hyeon-Seok Hwang;Won-Whoi Hun
- The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.9-14 / 2024
This study developed a voice recognition chatbot system to address depression and loneliness among the elderly in an aging society. The system utilizes the Whisper model, GPT 2.5, and XTTS2 to provide high-performance voice recognition, natural language processing, and text-to-speech conversion. Users can express their emotions and states and receive appropriate responses, with voice recognition functionality using familiar voices for comfort and reassurance. The UX/UI design considers the cognitive responses, visual impairments, and physical limitations of the smart senior generation, using high contrast colors and readable fonts for enhanced usability. This research is expected to improve the quality of life for the elderly through voice-based interfaces.
https://doi.org/10.7236/JIIBC.2024.24.4.9

Trends in Brain Wave Signal and Application Technology (뇌파신호 및 응용 기술 동향)

Kim, D.Y.;Lee, J.H.;Park, M.H.;Choi, Y.H.;Park, Y.O.
- Electronics and Telecommunications Trends / v.32 no.2 / pp.19-28 / 2017
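Brain wave signals are a useful information source through which a person's thoughts and emotions can be acquired, interpreted, and analyzed in the most realistic way. Following speech recognition, brain waves are a promising and ultimate means of enabling convenient and highly natural hyper-connection access and communication between people, between people and things, and between people and computers. However, as long as brain waves are interpreted only as the sum of averaged electromagnetic signals produced by the chemical activation of synapses formed between neurons during brain activity, they cannot be linked to the complex patterns of human thought and emotion established in brain science. To overcome this limitation and rediscover the technical value and potential of brain waves as a future means of hyper-connected access and communication, this article describes trends in the processing, interpretation, and application of brain wave signals, interpreted in conjunction with the thought and emotion circuits being uncovered by brain science.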

