• Title/Summary/Keyword: 감정 음성 (emotional speech)


On the Implementation of a Facial Animation Using the Emotional Expression Techniques (FAES : 감성 표현 기법을 이용한 얼굴 애니메이션 구현)

  • Kim Sang-Kil;Min Yong-Sik
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.2
    • /
    • pp.147-155
    • /
    • 2005
  • In this paper, we present FAES (Facial Animation with Emotion and Speech), a system for speech-driven face animation with emotions. We animate cartoon faces not only from the input speech but also from emotions derived from the speech signal, and the system ensures smooth transitions and accurate representation in the animation. To do this, after collecting training data, we built a database and used an SVM (Support Vector Machine) to recognize four categories of emotion: neutral, dislike, fear, and surprise, which lets the system drive the animation with emotions recognized from speech. The system was trained on young Korean speakers and focuses only on Korean emotional facial expressions. Experimental results show that the range of expressed emotion is expanded and that the accuracies of emotion recognition and continuous speech recognition increase by 7% and 5%, respectively, compared with the previous method.

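The FAES entry above classifies speech into four emotion categories (neutral, dislike, fear, surprise) with an SVM. A minimal sketch of that classification step follows, assuming utterance-level MFCC statistics extracted with librosa and scikit-learn's SVC; the paper's actual features and SVM configuration are not specified, so these choices are illustrative.

```python
# Minimal sketch: SVM emotion classification from speech features.
# Assumptions (not from the paper): MFCC mean/std features via librosa, RBF-kernel SVC.
import librosa
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

EMOTIONS = ["neutral", "dislike", "fear", "surprise"]

def speech_features(wav_path: str) -> np.ndarray:
    """Summarize an utterance as the mean and std of its MFCCs (illustrative choice)."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)            # (13, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # (26,)

def train_emotion_svm(wav_paths, labels):
    """Fit an SVM on utterance-level features; labels are indices into EMOTIONS."""
    X = np.stack([speech_features(p) for p in wav_paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    clf.fit(X, labels)
    return clf

# Usage (hypothetical file lists):
# clf = train_emotion_svm(train_wavs, train_labels)
# print(EMOTIONS[clf.predict([speech_features("test.wav")])[0]])
```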

Multi-Emotion Recognition Model with Text and Speech Ensemble (텍스트와 음성의 앙상블을 통한 다중 감정인식 모델)

  • Yi, Moung Ho;Lim, Myoung Jin;Shin, Ju Hyun
    • Smart Media Journal
    • /
    • v.11 no.8
    • /
    • pp.65-72
    • /
    • 2022
  • Due to COVID-19, counseling has shifted from face-to-face to remote sessions, and the importance of non-face-to-face counseling is increasing. Its advantage is that clients can be counseled online anytime, anywhere, safe from COVID-19; however, it is harder to understand the client's state of mind because non-verbal cues are largely unavailable. It is therefore important to recognize emotions by accurately analyzing text and voice during non-face-to-face counseling. In this paper, text data is vectorized with FastText after consonant (jamo) separation, and voice data is vectorized by extracting Log Mel Spectrogram and MFCC features. We propose a multi-emotion recognition model that recognizes five emotions from the vectorized data using an LSTM model, and evaluate multi-emotion recognition with RMSE. In the experiments, the proposed model achieved an RMSE of 0.2174, the lowest error compared with models using text or voice data alone.
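
The entry above regresses five emotion intensities from vectorized speech/text sequences with an LSTM and scores the result with RMSE. A minimal sketch of such a model follows, assuming Keras, 100-step sequences of 40-dimensional MFCC/log-mel frames, and an MSE training loss; these shapes and hyperparameters are illustrative, not the paper's.

```python
# Minimal sketch of an LSTM-based multi-emotion regression model evaluated with RMSE.
# Assumptions (not from the paper): TensorFlow/Keras, toy input shapes, MSE loss.
import numpy as np
import tensorflow as tf

TIME_STEPS, N_FEATURES, N_EMOTIONS = 100, 40, 5

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIME_STEPS, N_FEATURES)),
    tf.keras.layers.LSTM(128),                                 # sequence -> fixed summary
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_EMOTIONS, activation="sigmoid"),   # per-emotion intensity
])
model.compile(optimizer="adam", loss="mse")

# Hypothetical toy data standing in for the vectorized speech/text features.
X = np.random.rand(32, TIME_STEPS, N_FEATURES).astype("float32")
y = np.random.rand(32, N_EMOTIONS).astype("float32")
model.fit(X, y, epochs=1, verbose=0)

rmse = np.sqrt(model.evaluate(X, y, verbose=0))                # RMSE as in the paper's metric
print(f"RMSE: {rmse:.4f}")
```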

The Voice of Professional Voice Users (음성직업인의 음성에 관하여)

  • 문영일
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.4 no.1
    • /
    • pp.6-11
    • /
    • 1991
  • Since the dawn of humankind, the voice has been used as a means of communication, transmitting knowledge, and expressing emotion. What distinguishes humans from other animals is the ability to speak using the voice, and through speech, knowledge and a unique culture can be handed down from one generation to the next. (Abridged)


Development of Context Awareness and Service Reasoning Technique for Handicapped People (멀티 모달 감정인식 시스템 기반 상황인식 서비스 추론 기술 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.1
    • /
    • pp.34-39
    • /
    • 2009
  • Human emotion is a subjective response with impulsive characteristics that unconsciously expresses intentions and needs, and it carries rich contextual information about users of ubiquitous computing environments or intelligent robot systems. Indicators from which a user's emotion can be inferred include facial images, voice signals, and biological signal spectra. In this paper, we generate separate facial and voice emotion recognition results from facial images and speech to make emotion recognition more convenient and efficient. We then extract the best-fitting features from the image and sound to improve the recognition rate, and implement a multi-modal emotion recognition system based on feature fusion. Finally, using the recognition results, we demonstrate the feasibility of a ubiquitous computing service reasoning method based on a Bayesian network and a ubiquitous context scenario.
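
The entry above fuses face and voice information at the feature level before recognition. A minimal sketch of that fusion step follows, assuming the unimodal feature extractors already exist, fusion by simple concatenation, and a scikit-learn classifier standing in for the paper's recognizer; all sizes and the classifier choice are illustrative.

```python
# Minimal sketch of feature-level fusion for multi-modal (face + voice) emotion recognition.
# Assumptions (not from the paper): pre-extracted features, concatenation fusion,
# logistic regression as a stand-in classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fuse_features(face_feat: np.ndarray, voice_feat: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate per-sample face and voice feature vectors."""
    return np.concatenate([face_feat, voice_feat], axis=-1)

# Hypothetical pre-extracted features for N samples.
N = 200
face = np.random.rand(N, 64)               # e.g., facial appearance/geometry features
voice = np.random.rand(N, 26)              # e.g., prosodic/MFCC statistics
labels = np.random.randint(0, 4, size=N)   # emotion class indices

X = fuse_features(face, voice)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```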

A Study on the Comparison of the Commercial API for Recognizing Speech with Emotion (상용 API 의 감정에 따른 음성 인식 성능 비교 연구)

  • Janghoon Yang
    • Annual Conference of KIPS
    • /
    • 2023.05a
    • /
    • pp.52-54
    • /
    • 2023
  • With recent advances in artificial intelligence technology, more and more services rely on speech recognition, and its importance is growing. This paper evaluates, from the perspective of emotional speech, the differences among representative AI service APIs widely used in Korea, provided by Google, ETRI, and Naver. Evaluation with speech test data drawn from the emotional dialogue corpus provided by AI Hub showed that the ETRI API achieved the best speech recognition performance on the character error rate (1.29%) and word error rate (10.1%) metrics.
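
The comparison above is scored with character error rate (CER) and word error rate (WER). A minimal sketch of how those metrics are computed from reference/hypothesis transcript pairs follows; the paper's exact text normalization is not specified, so none is applied here.

```python
# Minimal sketch of CER and WER via Levenshtein edit distance.
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[len(hyp)]

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edits over reference length, at the character level."""
    return edit_distance(list(ref), list(hyp)) / max(len(ref), 1)

def wer(ref: str, hyp: str) -> float:
    """Word error rate: edits over reference length, at the whitespace-token level."""
    return edit_distance(ref.split(), hyp.split()) / max(len(ref.split()), 1)

# Hypothetical reference / recognized pair:
print(cer("오늘 기분이 어때", "오늘 기분 어때"), wer("오늘 기분이 어때", "오늘 기분 어때"))
```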

Analyzing the Acoustic Elements and Emotion Recognition from Speech Signal Based on DRNN (음향적 요소분석과 DRNN을 이용한 음성신호의 감성 인식)

  • Sim, Kwee-Bo;Park, Chang-Hyun;Joo, Young-Hoon
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.1
    • /
    • pp.45-50
    • /
    • 2003
  • Robot technology has recently advanced remarkably, and emotion recognition is necessary to build robots that feel approachable. This paper presents a simulator, and its simulation results, that classifies and recognizes emotions by learning pitch patterns. Because pitch alone is not sufficient for recognizing emotion, we add further acoustic elements, and to that end we analyze the relation between emotion and these acoustic elements. The simulator consists of a feature extraction stage and a DRNN (Dynamic Recurrent Neural Network), which serves as the learning algorithm for the pitch patterns.
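
The DRNN in this entry learns emotion classes from pitch patterns over time, but the abstract does not give its formulation. The sketch below only illustrates the general idea with a plain Elman-style recurrent cell in NumPy that maps a pitch contour to emotion scores; the weights, sizes, and normalization are illustrative, not the paper's DRNN.

```python
# Illustrative Elman-style recurrent pass over a pitch contour (not the paper's DRNN).
import numpy as np

rng = np.random.default_rng(0)
N_HIDDEN, N_EMOTIONS = 16, 4

# Randomly initialized weights stand in for a trained network.
W_in  = rng.standard_normal((N_HIDDEN, 1)) * 0.1          # pitch value -> hidden
W_rec = rng.standard_normal((N_HIDDEN, N_HIDDEN)) * 0.1   # hidden -> hidden (recurrence)
W_out = rng.standard_normal((N_EMOTIONS, N_HIDDEN)) * 0.1 # hidden -> emotion scores

def emotion_scores(pitch_contour):
    """Run the recurrent cell over a 1-D pitch contour (Hz) and return emotion scores."""
    h = np.zeros(N_HIDDEN)
    for f0 in pitch_contour:
        x = np.array([f0 / 500.0])                 # crude normalization of pitch
        h = np.tanh(W_in @ x + W_rec @ h)          # recurrent state update
    logits = W_out @ h
    return np.exp(logits) / np.exp(logits).sum()   # softmax over emotion classes

print(emotion_scores(np.linspace(120.0, 240.0, 50)))  # rising pitch contour (toy input)
```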

A Design of Artificial Emotion Model (인공 감정 모델의 설계)

  • Lee, In-Geun;Seo, Seok-Tae;Jeong, Hye-Cheon;Gwon, Sun-Hak
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2007.04a
    • /
    • pp.58-62
    • /
    • 2007
  • Alongside research on recognizing human emotional states from human-generated speech, facial images, and text, research is being conducted on artificial emotion, which mimics human emotion and generates emotional states from various external stimuli. Existing artificial emotion studies, however, change the emotional state linearly or exponentially in response to external emotional stimuli, so the emotional state changes abruptly. In this paper, we propose an emotion generation model that reflects not only the intensity and frequency of external emotional stimuli but also their repetition period, and that expresses the change of emotion over time as a sigmoid curve. We also propose an artificial emotion system that can generate emotion even without external emotional stimuli, through recollection of past emotional stimuli.

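The model above replaces abrupt linear or exponential emotion changes with a sigmoid-shaped response. The abstract gives no equations, so the sketch below is only an illustration of a sigmoid rise after a stimulus followed by a slow decay; the function names and the gain, midpoint, and decay parameters are all assumed for the example.

```python
# Illustrative sigmoid-shaped emotion response to a single external stimulus.
# The parameterization (rise gain, midpoint offset, decay rate) is assumed, not the paper's.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def emotion_intensity(t: float, stimulus_strength: float, onset: float,
                      rise: float = 1.5, decay: float = 0.2) -> float:
    """Emotion rises along a sigmoid after the stimulus onset, then fades gradually."""
    if t < onset:
        return 0.0
    rise_part = sigmoid(rise * (t - onset) - 4.0)                # smooth S-shaped onset
    decay_part = math.exp(-decay * max(t - onset - 8.0, 0.0))    # gradual fading
    return stimulus_strength * rise_part * decay_part

for t in range(0, 30, 5):
    print(t, round(emotion_intensity(t, stimulus_strength=1.0, onset=2.0), 3))
```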

Analyzing the Acoustic Elements and Emotion Recognition from Speech Signal Based on DRNN (음향적 요소분석과 DRNN을 이용한 음성신호의 감성인식)

  • 박창현;심귀보
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2002.12a
    • /
    • pp.489-492
    • /
    • 2002
  • Humanoid robot development has recently made remarkable progress, and it is increasingly recognized that emotion recognition is essential for building friendly robots. This paper presents the development and experimental results of a simulator that classifies and recognizes emotion from pitch patterns, which carry the largest share of emotional information in speech. We also show that more reliable recognition is possible when acoustic factors such as sharpness and lowness are included as classification criteria. The simulator mainly consists of a module that extracts pitch from speech, a DRNN module that learns pitch patterns, and an acoustic feature extraction module. Pitch is extracted with an autocorrelation approach using a center-clipping function, and a (1+100)-ES is used to find the optimal individual during training.
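
The entry above extracts pitch with an autocorrelation approach after center clipping. A minimal NumPy sketch of that estimator for a single analysis frame follows; the frame length, clipping ratio, and pitch search range are illustrative choices, not taken from the paper.

```python
# Center-clipped autocorrelation pitch estimation for a single frame (illustrative settings).
import numpy as np

def frame_pitch(frame: np.ndarray, sr: int, f_min: float = 60.0, f_max: float = 400.0,
                clip_ratio: float = 0.3) -> float:
    """Estimate F0 (Hz) of one speech frame via center clipping + autocorrelation."""
    # Center clipping suppresses formant structure and keeps the strong pitch pulses.
    threshold = clip_ratio * np.max(np.abs(frame))
    clipped = np.where(np.abs(frame) > threshold,
                       frame - np.sign(frame) * threshold, 0.0)

    # One-sided autocorrelation of the clipped frame.
    ac = np.correlate(clipped, clipped, mode="full")[len(clipped) - 1:]

    # Pick the autocorrelation peak inside the plausible pitch-lag range.
    lag_min, lag_max = int(sr / f_max), int(sr / f_min)
    if lag_max >= len(ac) or ac[0] == 0:
        return 0.0  # frame too short or silent -> treat as unvoiced
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return sr / lag

# Toy usage: a synthetic 200 Hz frame should come out near 200 Hz.
sr = 16000
t = np.arange(int(0.04 * sr)) / sr
print(frame_pitch(np.sin(2 * np.pi * 200.0 * t), sr))
```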

Acoustic analyses in the imitation of emotional speech in children with typical development (일반 아동의 감정 발화 모방 능력: 음향학적 분석을 중심으로)

  • Subeen Kim;Jungeun Kim;Soohyoung Cho;Hyosun Lee;Seongyun Moon;Youngmee Lee
    • Phonetics and Speech Sciences
    • /
    • v.16 no.3
    • /
    • pp.49-57
    • /
    • 2024
  • This study aimed to investigate the acoustic characteristics of emotional speech in typically developing children. Thirteen preschoolers (4-5.9 years old) and 22 school-aged children (6-9.9 years old) participated in the study. The children were asked to imitate 15 utterances representing three different emotional expressions (happy, sad, and angry). Average measures of fundamental frequency, intensity, and duration of the emotional expressions in the children's utterances were obtained. We found that both preschoolers and school-aged children imitated the emotional utterances with different fundamental frequency, intensity, and duration depending on the type of emotion (happy, sad, angry). In particular, school-aged children spoke more slowly than preschoolers when expressing sadness. These results suggest that preschoolers and school-aged children can express emotions by modulating vocal pitch, intensity, and duration, and that school-aged children rely more than preschoolers on the duration parameter of prosody to distinguish emotions. The duration differences between the two groups may be influenced by the maturity of the children's speech and language development.
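
The study above reports per-utterance averages of fundamental frequency, intensity, and duration by emotion. A minimal sketch of extracting those three measures from a recording follows, assuming librosa with generic analysis defaults rather than the authors' actual analysis settings.

```python
# Illustrative per-utterance extraction of the three acoustic measures discussed above:
# mean fundamental frequency (F0), mean intensity, and duration.
import numpy as np
import librosa

def utterance_measures(wav_path: str) -> dict:
    y, sr = librosa.load(wav_path, sr=None)
    duration = len(y) / sr                                     # seconds

    # F0 contour via the pYIN tracker; keep voiced frames only.
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C6"), sr=sr)
    mean_f0 = float(np.nanmean(f0[voiced])) if np.any(voiced) else float("nan")

    # Intensity as mean RMS energy in dB (relative scale).
    rms = librosa.feature.rms(y=y)[0]
    mean_db = float(np.mean(librosa.amplitude_to_db(rms, ref=1.0)))

    return {"duration_s": duration, "mean_f0_hz": mean_f0, "mean_intensity_db": mean_db}

# Usage (hypothetical file): print(utterance_measures("happy_child_01.wav"))
```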