• Title/Summary/Keyword: 감정 음성 (emotional speech)


Intelligent Robots and Face Recognition Convergence Technology (지능형 로봇과 얼굴 인식 융합기술)

  • Kee, Seok-Cheol
    • Review of KIISC
    • /
    • v.17 no.5
    • /
    • pp.25-31
    • /
    • 2007
  • The ultimate goal of intelligent robots is to provide human-centered services: by converging IT technology and intelligence into a robot, the robot recognizes the user on its own, performs the tasks the user wants, and retrieves the information the user wants. Providing such services requires the development of Human-Robot Interaction (HRI) technology, which forms the link of interaction and communication between the two entities, human and robot, through various communication channels. HRI technologies include face recognition, speech recognition, gesture recognition, and emotion recognition, i.e., technologies that let a robot recognize human expressions of intent. This article surveys convergence technology trends in intelligent robots and face recognition, the core function of robot visual intelligence, focusing on application services and standardization issues.

Speech and Textual Data Fusion for Emotion Detection: A Multimodal Deep Learning Approach (감정 인지를 위한 음성 및 텍스트 데이터 퓨전: 다중 모달 딥 러닝 접근법)

  • Edward Dwijayanto Cahyadi;Mi-Hwa Song
    • Annual Conference of KIPS
    • /
    • 2023.11a
    • /
    • pp.526-527
    • /
    • 2023
  • Speech emotion recognition (SER) is one of the interesting topics in the machine learning field. Developing a multimodal speech emotion recognition system brings numerous benefits. This paper explains how to fuse BERT as the text recognizer with a CNN as the speech recognizer to build a multimodal SER system (see the sketch below).
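
The abstract names the two branches (BERT for text, a CNN for speech) but not their exact architecture. The following is a minimal late-fusion sketch in PyTorch, assuming a pre-computed 768-dim BERT sentence embedding and a log-mel spectrogram input; the layer sizes and the five-class output are illustrative assumptions, not the paper's model.

```python
# Minimal late-fusion sketch for a multimodal SER model: a small CNN
# encodes the spectrogram, and its features are concatenated with a
# pre-computed BERT sentence embedding before classification.
import torch
import torch.nn as nn

class MultimodalSER(nn.Module):
    def __init__(self, num_emotions=5, text_dim=768):
        super().__init__()
        # Audio branch: CNN over a (1, n_mels, n_frames) spectrogram.
        self.audio_cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 32)
        )
        # Fusion head: concatenate audio features with the BERT embedding.
        self.classifier = nn.Sequential(
            nn.Linear(32 + text_dim, 128), nn.ReLU(),
            nn.Linear(128, num_emotions),
        )

    def forward(self, spectrogram, bert_embedding):
        audio_feat = self.audio_cnn(spectrogram)
        fused = torch.cat([audio_feat, bert_embedding], dim=1)
        return self.classifier(fused)

# Example: batch of 2 utterances, 64 mel bands x 100 frames.
model = MultimodalSER()
logits = model(torch.randn(2, 1, 64, 100), torch.randn(2, 768))
print(logits.shape)  # torch.Size([2, 5])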

An approach to utilize human empathy measurement (인간의 공감 측정에 대한 기술 및 활용방안)

  • Jin, Jung-A;Kim, Sun-Woo;Choi, Yeon-Sung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.9 no.1
    • /
    • pp.32-37
    • /
    • 2016
  • When another person's finger is caught in a car door, do you know how that feels? Do you feel emotion when reading a novel? How do you perceive another person's emotion? This is due to empathy. Empathy is feeling, or reacting emotionally to, another person's emotion; it is feeling together. In this paper, we describe how to measure this empathy. To begin with, measuring human empathy is not easy, because human communication systems are diverse. Since empathy is felt and expressed differently by each person, analysis of various kinds of data is required. In this paper, we propose a new method that utilizes head nodding and short speech units (1 sec: A-ha, Yes, Good, etc.).

An acoustical analysis of emotional speech using close-copy stylization of intonation curve (억양의 근접복사 유형화를 이용한 감정음성의 음향분석)

  • Yi, So Pae
    • Phonetics and Speech Sciences
    • /
    • v.6 no.3
    • /
    • pp.131-138
    • /
    • 2014
  • A close-copy stylization of intonation curve was used for an acoustical analysis of emotional speech. For the analysis, 408 utterances of five emotions (happiness, anger, fear, neutral, and sadness) were processed to extract acoustical feature values. The results show that certain pitch point features (pitch point movement time and pitch point distance within a sentence) and sentence-level features (pitch range of a final pitch point, pitch range of a sentence, and pitch slope of a sentence) are affected by emotion. Pitch point movement time, pitch point distance within a sentence, and pitch slope of a sentence show no significant difference between male and female participants. The emotions with high arousal (happiness and anger) are consistently distinguished from the emotion with low arousal (sadness) in terms of these acoustical features: emotions with higher arousal show a steeper pitch slope of a sentence, a steeper pitch slope at the end of a sentence, and a wider pitch range of a sentence. The acoustical analysis in this study implies that measuring these acoustical features can be used to cluster and identify emotions in speech. A sketch of extracting comparable sentence-level pitch features is given below.
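
The paper's features come from close-copy stylized intonation curves; as a rough approximation only, sentence-level pitch range and pitch slope can be estimated from a raw F0 track. In the sketch below, librosa's pYIN tracker stands in for the stylization step, and the file name is hypothetical.

```python
# Illustrative sketch of two sentence-level pitch features analyzed in
# the paper: pitch range of a sentence (Hz) and pitch slope of a
# sentence (Hz per second), estimated from a pYIN F0 track.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)
f0, voiced_flag, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)

t = librosa.times_like(f0, sr=sr)
voiced = voiced_flag & ~np.isnan(f0)

pitch_range = f0[voiced].max() - f0[voiced].min()   # Hz
slope, _ = np.polyfit(t[voiced], f0[voiced], 1)     # Hz per second

print(f"sentence pitch range: {pitch_range:.1f} Hz")
print(f"sentence pitch slope: {slope:.1f} Hz/s")
```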

Integrated Verbal and Nonverbal Sentiment Analysis System for Evaluating Reliability of Video Contents (영상 콘텐츠의 신뢰도 평가를 위한 언어와 비언어 통합 감성 분석 시스템)

  • Shin, Hee Won;Lee, So Jeong;Son, Gyu Jin;Kim, Hye Rin;Kim, Yoonhee
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.4
    • /
    • pp.153-160
    • /
    • 2021
  • With the advent of the "age of video," driven by the simplification of video content production and the convenience of operating broadcasting channels, review videos on various products are drawing attention. We propose RASIA, an integrated reliability analysis system based on verbal and nonverbal sentiment analysis of review videos. RASIA extracts and quantifies the emotional values obtained through language sentiment analysis and facial analysis of the reviewer in the video, then conducts an integrated reliability analysis of the standardized verbal and nonverbal sentiment values (a sketch of this step follows below). RASIA provides a new objective indicator for evaluating the reliability of review videos.
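
The abstract says the verbal and nonverbal sentiment values are standardized and then integrated, but gives no formula. The following minimal sketch assumes z-score standardization and a weighted average; the reference statistics and the weight are invented placeholders, not RASIA's actual parameters.

```python
# Standardize-then-combine sketch: two sentiment scores on different
# scales are z-scored against reference statistics and averaged into
# a single integrated reliability indicator.
def standardize(score, ref_mean, ref_std):
    """Z-score a raw sentiment value against reference statistics."""
    return (score - ref_mean) / ref_std

def integrated_reliability(verbal_score, facial_score,
                           verbal_ref=(0.0, 1.0), facial_ref=(0.0, 1.0),
                           w_verbal=0.5):
    """Weighted combination of standardized verbal/nonverbal values."""
    v = standardize(verbal_score, *verbal_ref)
    f = standardize(facial_score, *facial_ref)
    return w_verbal * v + (1 - w_verbal) * f

# Example: positive text sentiment but a nearly neutral facial reading.
print(integrated_reliability(0.8, 0.1))
```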

A Study on the Creation of Interactive Text Collage using Viewer Narratives (관람자 내러티브를 활용한 인터랙티브 텍스트 콜라주 창작 연구)

  • Lim, Sooyeon
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.4
    • /
    • pp.297-302
    • /
    • 2022
  • Contemporary viewers, familiar with the digital space, show a desire for self-expression and use voice, text, and gestures as tools of expression. The purpose of this study is to create interactive art that expresses the narrative uttered by the viewer as a collage built from the viewer's own figure, and that reproduces and expands the story through the viewer's movement. The proposed interactive art visualizes audio and video information acquired from the viewer as a text collage, and uses gesture information and a natural user interface so that viewers can interact easily and conveniently in real time and express personalized emotions. The three kinds of information obtained from the viewer are connected to one another to express the viewer's current, momentary emotions. The rigid narrative of the text gains some degree of freedom through the viewer's portrait images and gestures, while at the same time producing and expanding a story structure close to reality. The artwork space created in this way is an experiential space where the viewer's narrative is reflected, updated, and created in real time; it is a reflection of the self, and it induces active appreciation through the viewer's active intervention and action.

Automatic Adaptation Based Metaverse Virtual Human Interaction (자동 적응 기반 메타버스 가상 휴먼 상호작용 기법)

  • Chung, Jin-Ho;Jo, Dongsik
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.2
    • /
    • pp.101-106
    • /
    • 2022
  • Recently, virtual humans have been widely used in various fields such as education, training, and information guidance, and they are expected to be applied to services that interact with remote users in the metaverse. In this paper, we propose a novel method for making a virtual human's interaction perceive the user's surroundings. We use an editing authoring tool to apply the user's interaction and provide the virtual human's response. The virtual human recognizes the user's situation based on fuzzy logic and presents an optimal response to the user. With the context-aware interaction method addressed in this paper, the virtual human can provide interaction suited to the surrounding environment through automatic adaptation (a toy sketch follows below).
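
The abstract says only that situations are recognized "based on fuzzy"; the inputs, membership functions, and rules below are invented purely for illustration of fuzzy context-aware response selection and are not the paper's model.

```python
# Toy fuzzy inference sketch: two sensed quantities (ambient noise,
# user distance) are fuzzified with triangular memberships, rule
# strengths are combined with min (fuzzy AND), and the strongest
# rule picks the virtual human's response.
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def choose_response(noise_db, distance_m):
    quiet = tri(noise_db, -1, 30, 60)
    loud  = tri(noise_db, 40, 80, 121)
    near  = tri(distance_m, -0.1, 0.5, 2.0)
    far   = tri(distance_m, 1.0, 4.0, 10.0)

    rules = {
        "speak softly":       min(quiet, near),
        "speak up":           min(loud, near),
        "use on-screen text": min(loud, far),
        "wave and greet":     min(quiet, far),
    }
    return max(rules, key=rules.get)

print(choose_response(noise_db=75, distance_m=0.8))  # -> "speak up"
```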

Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm (오피니언 마이닝 알고리즘 기반 음성인식 인터뷰 모델의 설계 및 구현)

  • Kim, Kyu-Ho;Kim, Hee-Min;Lee, Ki-Young;Lim, Myung-Jae;Kim, Jeong-Lae
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.1
    • /
    • pp.225-230
    • /
    • 2012
  • Opinion mining applies existing data mining technology to text uploaded to the web, such as blog posts and product comments, to extract the author's opinion; it does not judge the subject of the text, only the sentiment expressed toward that subject. In this paper, we suggest a method that judges emotions by applying published opinion mining algorithms to text converted from voice data through a speech recognition API. The system links the open Google Voice Recognition API with a ranking (sunwihwa) algorithm of improved design that determines polarity, and on this basis implements the speech recognition interview model (see the pipeline sketch below).
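
As a hedged illustration of that pipeline, the sketch below transcribes an interview answer with the SpeechRecognition package's Google Web Speech binding and scores polarity against a toy lexicon. Both the package choice and the lexicon are assumptions for illustration, not the paper's implementation; the audio file name is hypothetical.

```python
# Speech-to-text followed by lexicon-based polarity scoring.
import speech_recognition as sr

POLARITY_LEXICON = {"good": 1, "great": 1, "confident": 1,
                    "bad": -1, "nervous": -1, "difficult": -1}

def transcribe(wav_path):
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    # Google Web Speech API, mirroring the paper's use of Google recognition.
    return recognizer.recognize_google(audio)

def polarity(text):
    total = sum(POLARITY_LEXICON.get(w, 0) for w in text.lower().split())
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"

answer = transcribe("interview_answer.wav")  # hypothetical file
print(answer, "->", polarity(answer))
```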

UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS (대용량 한국어 TTS의 결정트리기반 음성 DB 감축 방안)

  • Lee, Jung-Chul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.7
    • /
    • pp.91-98
    • /
    • 2010
  • Large corpus-based concatenative Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. Because improving the naturalness, personality, speaking style, and emotion of synthetic speech requires a larger speech DB, it is necessary to prune redundant speech segments from a large segmental speech DB. In this paper, we propose a new method to construct a segmental speech DB for a Korean TTS system, based on a clustering algorithm, in order to downsize the segmental speech DB (a schematic sketch follows below). For the performance test, synthetic speech was generated using a Korean TTS system consisting of a language processing module, prosody processing module, segment selection module, speech concatenation module, and segmental speech DB. An MOS test was then executed on sets of synthetic speech generated with four different segmental speech DBs, constructed by combining the CM1 (or CM2) tree clustering method with the full (or reduced) DB. Experimental results show that the proposed method can reduce the size of the speech DB by 23% while keeping a high MOS in the perception test. The proposed method can therefore be applied to build a compact TTS system.
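
The paper's reduction uses decision-tree clustering (CM1/CM2); as a stand-in for illustration only, the sketch below groups segments with k-means over placeholder feature vectors and keeps one representative per cluster, targeting the 23% reduction reported above.

```python
# Schematic DB-pruning sketch: cluster acoustically similar segments,
# keep the segment nearest each centroid, discard the rest.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
segments = rng.normal(size=(1000, 13))  # e.g. 13-dim MFCC means per segment

n_clusters = 770  # ~23% reduction, matching the figure reported above
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(segments)

kept = []
for c in range(n_clusters):
    members = np.flatnonzero(kmeans.labels_ == c)
    dists = np.linalg.norm(segments[members] - kmeans.cluster_centers_[c], axis=1)
    kept.append(members[np.argmin(dists)])

print(f"reduced DB: {len(kept)} of {len(segments)} segments "
      f"({1 - len(kept)/len(segments):.0%} pruned)")
```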

A Research of Optimized Metadata Extraction and Classification in Audio (미디어에서의 오디오 메타데이터 최적화 추출 및 분류 방안에 대한 연구)

  • Yoon, Min-hee;Park, Hyo-gyeong;Moon, Il-Young
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.147-149
    • /
    • 2021
  • Recently, the media market has grown rapidly and users' expectations have been rising. In this research, tags are extracted from media-derived audio and classified into specific categories using artificial intelligence. The categories are emotion types including joy, anger, sadness, love, hatred, and desire. We conduct the study in Jupyter Notebook, analyze voice data with the LiBROSA library, and build a neural network using Keras layer models (a minimal sketch of this toolchain follows below).
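
The abstract names librosa and Keras but no concrete architecture. The following minimal sketch pairs utterance-level MFCC means with a small dense network over the six emotion categories listed above; the file name, feature choice, and layer sizes are illustrative assumptions.

```python
# librosa features into a small Keras classifier over emotion tags.
import numpy as np
import librosa
from tensorflow import keras

EMOTIONS = ["joy", "anger", "sadness", "love", "hatred", "desire"]

def extract_features(wav_path):
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)  # 20-dim utterance-level feature vector

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(len(EMOTIONS), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Hypothetical usage with an untrained model, for shape checking only.
features = extract_features("clip.wav")[np.newaxis, :]
probs = model.predict(features)
print(EMOTIONS[int(np.argmax(probs))])
```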
