• Title/Summary/Keyword: Emotional Speech Recognition


Voice Recognition Chatbot System for an Aging Society: Technology Development and Customized UI/UX Design (고령화 사회를 위한 음성 인식 챗봇 시스템 : 기술 개발과 맞춤형 UI/UX 설계)

  • Yun-Ji Jeong;Min-Seong Yu;Joo-Young Oh;Hyeon-Seok Hwang;Won-Whoi Hun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.9-14 / 2024
  • This study developed a voice recognition chatbot system to address depression and loneliness among the elderly in an aging society. The system utilizes the Whisper model, GPT 2.5, and XTTS2 to provide high-performance voice recognition, natural language processing, and text-to-speech conversion. Users can express their emotions and states and receive appropriate responses, with voice recognition functionality using familiar voices for comfort and reassurance. The UX/UI design considers the cognitive responses, visual impairments, and physical limitations of the smart senior generation, using high contrast colors and readable fonts for enhanced usability. This research is expected to improve the quality of life for the elderly through voice-based interfaces.

Robust Speech Recognition for Emotional Variation (감정 변화에 강인한 음성 인식)

  • Kim, Won-Gu
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.431-434 / 2007
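  • This paper studies feature parameters for a speech recognition system that are less affected by changes in human emotion. Using a speech database containing a variety of emotions, we first examined how emotional variation affects recognition performance and which feature parameters are least affected by it. The features considered were LPC cepstral coefficients, mel-cepstral coefficients, root cepstral coefficients, PLP coefficients, RASTA-processed mel-cepstral coefficients, and speech energy. CMS and SBR were also compared as methods for removing the bias contained in the speech signal. In experiments with an HMM-based speaker-independent word recognizer, the best performance was obtained with RASTA mel-cepstrum and delta cepstrum features combined with CMS for signal bias removal. This corresponds to an error reduction of about 59% relative to the baseline system that uses mel-cepstrum.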
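
The bias-removal step named in this abstract can be illustrated with a minimal sketch. It assumes that CMS denotes cepstral mean subtraction, its usual expansion, and that a delta cepstrum can be approximated by a two-point central difference; the abstract specifies neither, and the function names are hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean from every cepstral coefficient.

    frames: list of frames, each a list of cepstral coefficients.
    Assumption: CMS in the abstract means cepstral mean subtraction.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over the whole utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def delta_cepstrum(frames):
    """Two-point central difference over time, repeating the edge frames.

    Assumption: a simplified stand-in for whatever delta window the paper used.
    """
    n = len(frames)
    deltas = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        deltas.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return deltas
```

A fixed channel bias adds the same offset to every frame in the cepstral domain, so subtracting the per-utterance mean cancels it: two utterances that differ only by a constant offset give identical output.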


Recognition of Emotional states in Speech using Hidden Markov Model (HMM을 이용한 음성에서의 감정인식)

  • Kim, Sung-Ill;Lee, Sang-Hoon;Shin, Wee-Jae;Park, Nam-Chun
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2004.10a / pp.560-563 / 2004
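  • This paper describes a new approach to recognizing human emotional states such as anger, happiness, calm, sadness, and surprise. The approach uses continuous hidden Markov models (HMMs) that incorporate discrete duration. First, emotional feature parameters are defined from the input speech signal. The study uses prosodic parameters, namely the pitch signal, energy, and their derivatives, and trains the HMMs on them. For speaker adaptation, emotion models based on maximum a posteriori (MAP) estimation are used. The experiments show that the emotion recognition rate in speech increases gradually as the number of adaptation samples grows.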
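
The effect of the number of adaptation samples can be sketched with the textbook MAP update for a single Gaussian mean. This is an illustration only: the abstract does not give its adaptation formula, and the prior weight `tau` and the function names are hypothetical.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean with a conjugate prior.

    prior_mean: mean of the speaker-independent model.
    samples:    adaptation observations from the new speaker.
    tau:        assumed prior weight; larger values trust the prior more.
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


def adaptation_curve(prior_mean, samples, tau=10.0):
    """Adapted mean after seeing the first 0, 1, ..., len(samples) samples."""
    return [map_adapt_mean(prior_mean, samples[:n], tau)
            for n in range(len(samples) + 1)]
```

With no adaptation data the estimate equals the prior mean, and it moves toward the speaker's own sample mean as samples accumulate, which is consistent with the gradual improvement the abstract reports.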


Generating Speech feature vectors for Effective Emotional Recognition (효과적인 감정인식을 위한 음성 특징 벡터 생성)

  • Sim, In-woo;Han, Eui Hwan;Cha, Hyung Tai
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.687-690 / 2019
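  • This paper generates effective feature vectors for emotion recognition. We used the RAVDESS speech dataset, taking the speech signals for four emotions: neutral, calm, happy, and sad. To select effective features from those conventionally used for emotion recognition (MFCC coefficients 1 to 13, pitch, ZCR, and peakenergy), we used the ratio of between-class to within-class variance. The experiments confirmed that, among these features, peakenergy, pitch, MFCC2, MFCC3, MFCC4, MFCC12, and MFCC13 are effective.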
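
The selection criterion named in this abstract, the ratio of between-class to within-class variance, can be sketched per feature as follows. The exact normalization the authors used is not stated, so this uses unnormalized sums of squares, and the function names and data layout are hypothetical.

```python
def fisher_ratio(values_by_class):
    """Between-class over within-class scatter for one scalar feature.

    values_by_class: dict mapping an emotion label to a list of feature values.
    """
    all_values = [v for vals in values_by_class.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for vals in values_by_class.values():
        class_mean = sum(vals) / len(vals)
        # Scatter of the class mean around the grand mean, weighted by class size.
        between += len(vals) * (class_mean - grand_mean) ** 2
        # Scatter of the samples around their own class mean.
        within += sum((v - class_mean) ** 2 for v in vals)
    if within == 0.0:
        return float("inf") if between > 0.0 else 0.0
    return between / within


def rank_features(features):
    """Sort feature names from most to least discriminative.

    features: dict mapping a feature name to its values_by_class dict.
    """
    return sorted(features, key=lambda name: fisher_ratio(features[name]),
                  reverse=True)
```

A feature whose class means are far apart relative to the spread inside each class scores high; a feature with identical class means scores zero.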

Recognition of Emotional states in speech using combination of Unsupervised Learning with Supervised Learning (비감독 학습과 감독학습의 결합을 통한 음성 감정 인식)

  • Bae, Sang-Ho;Lee, Jang-Hoon;Kim, Hyun-jung;Won, Il-Young
    • Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.391-394 / 2011
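  • Automatically recognizing a user's emotion is an important element of providing user-centered services. Humans recognize a single emotion by classifying it in varied ways. When emotion is recognized through machine learning, however, treating each emotion as a single value is unlikely to yield good performance on its own. This paper therefore proposes an emotion recognition model that combines unsupervised and supervised learning. The core of the proposed model is to use unsupervised learning to divide one emotion into several sub-emotions, as humans do, and then to improve performance by applying supervised learning to the emotions classified in this way.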
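
One way to picture the two-stage idea is the sketch below. It is not the authors' model: k-means stands in for the unsupervised step, a nearest-centroid rule stands in for the supervised learner, and every name and parameter is hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns k centroids (assumed unsupervised step)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def fit_sub_emotions(data, k=2):
    """Split each labeled emotion into up to k sub-emotion centroids.

    data: dict mapping an emotion label to a list of feature vectors.
    Returns a list of (emotion, centroid) pairs.
    """
    model = []
    for emotion, points in data.items():
        for centroid in kmeans(points, min(k, len(points))):
            model.append((emotion, centroid))
    return model


def predict(model, x):
    """Label x with the parent emotion of its nearest sub-emotion centroid."""
    return min(model, key=lambda pair: dist2(x, pair[1]))[0]
```

In the toy case where one emotion forms two separate groups on either side of another emotion, a single centroid per emotion would fall on top of the other class, while the sub-emotion centroids keep the two groups apart.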

Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances (발화 내 감정의 정밀한 인식을 위한 한국어 문미억양의 활용)

  • Jang In-Chang;Lee Tae-Seung;Park Mikyoung;Kim Tae-Soo;Jang Dong-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.505-511 / 2005
  • Autonomic machines that interact with humans should be able to perceive states of emotion and attitude from implicit messages in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural way to exchange human messages. Automatic systems capable of understanding states of emotion and attitude have used features based on the pitch and energy of uttered sentences. The performance of existing emotion recognition systems can be further improved with the linguistic knowledge that a specific tonal section of a sentence is related to states of emotion and attitude. In this paper, we attempt to improve the emotion recognition rate by incorporating such linguistic knowledge about Korean ending boundary tones into an automatic system implemented with pitch-related features and multilayer perceptrons. An experiment on a Korean emotional speech database confirms an improvement of 4%.
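
The idea of drawing pitch features from the sentence-final section can be sketched as follows. The abstract does not say which features were computed, so the tail fraction, the slope and ratio features, and the function name are all assumptions; the multilayer perceptron that would consume these features is omitted.

```python
def boundary_tone_features(f0, tail_fraction=0.2):
    """Pitch features from the final part of an utterance.

    f0: per-frame pitch values in Hz, with 0 marking unvoiced frames.
    tail_fraction: assumed share of voiced frames treated as the ending section.
    """
    voiced = [v for v in f0 if v > 0]
    if len(voiced) < 2:
        return {"tail_slope": 0.0, "tail_mean_ratio": 1.0}
    n_tail = max(2, int(round(len(voiced) * tail_fraction)))
    tail = voiced[-n_tail:]
    # Least-squares slope of pitch against frame index within the tail.
    xs = list(range(len(tail)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(tail) / len(tail)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, tail))
    den = sum((x - mean_x) ** 2 for x in xs)
    overall_mean = sum(voiced) / len(voiced)
    return {
        "tail_slope": num / den,                    # Hz per frame; > 0 means rising
        "tail_mean_ratio": mean_y / overall_mean,   # tail pitch relative to utterance
    }
```

A rising ending yields a positive slope and a ratio above 1, a falling ending the opposite, so the two values give a classifier a compact description of the ending tone alongside utterance-level pitch statistics.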

The Effect of Reminiscence Therapy on Communication Ability of Elderly Patient With Alzheimer's Dementia (회상하기 프로그램이 알츠하이머 노인의 의사소통 능력에 미치는 영향)

  • Kim, Soo-Jung;Chang, Hyun-Jin
    • Therapeutic Science for Rehabilitation / v.9 no.4 / pp.21-31 / 2020
  • Objective: A reminiscence program is a method for restoring psychological stability in elderly people with dementia, and it also helps them express themselves correctly by helping them recollect memories of their past lives. The purpose of this study was to investigate the effects on communication ability of applying a reminiscence program to elderly patients with Alzheimer's dementia. Methods: The subjects were 4 patients with moderate Alzheimer's dementia. The experiment was designed with a pre-stage, a treatment stage, and a post-stage. The reminiscence therapy was composed of reminiscence activities covering the subjects' lives: childhood, adolescence, adulthood, and senescence. The therapy was delivered 30 times over 15 weeks. Results: The results of the study were as follows. First, recognition ability improved after reminiscence therapy. Second, the emotional aspect improved after reminiscence therapy. Third, communication ability improved after reminiscence therapy. Conclusion: In this study, reminiscence therapy had a positive effect on the communication skills of elderly people with Alzheimer's dementia. Reminiscence therapy is expected to be very helpful in improving the communication ability of elderly people with dementia in the future.

Study on function evaluation tools for stroke patients (뇌졸중(腦卒中) 환자(患者)의 기능평가방법(機能評價方法)에 대(對)한 연구(硏究))

  • Ko, Seong-Gyu;Ko, Chang-Nam;Chox, Ki-Ho;Kim, Young-Suk;Bae, Hyung-Sup;Lee, Kyung-Sup
    • The Journal of Korean Medicine / v.17 no.1 s.31 / pp.48-83 / 1996
  • Our conclusions on function evaluation tools for stroke patients are as follows. 1. The tools for evaluating Activities of Daily Living (the Katz Index, Barthel Index, and Modified Barthel Index) have high validity and reliability because of their ease of measurement, high accuracy, consistency, sensitivity, and sufficient statistics, but they mainly measure motor function and exclude sensation, mentation, language, and social conception. These tools therefore cannot be applied to patients with cerebrovascular disease or traumatic brain injury who have impaired cognition and sensation. 2. The PULSES Profile is a useful scale for measuring the patient's overall status, upper and lower limb functions, sensory components, excretory functions, and intellectual and emotional adaptability. It is recognized as a good, useful tool for evaluating a patient's whole function. 3. The Motor Assessment Scale was designed to measure the progress of stroke patients. The scale was supplemented with upper arm function items. We believe that the Motor Assessment Scale could be a useful evaluation tool with inter-rater and test-retest reliability. 4. The existing evaluation tools (the Katz Index, Barthel Index, Modified Barthel Index, PULSES Profile, and Motor Assessment Scale) mainly measure rehabilitative motor function in patients with sequelae of cerebrovascular disease. The CNS and NIH stroke scales, on the other hand, can measure the neurologic deficits and overall status of patients with cerebrovascular disease, namely recognition ability, speech status, motor function, sensory function, and activities of daily living. These scales have been recognized as useful tools for measuring the function of patients with cerebrovascular disease and are increasingly used. 5. Every function evaluation tool was recognized to have some validity and some inter-rater and test-retest reliability, both in its individual items and in its total score, but none of these scales is thought to have been fully validated and proved reliable. A highly reliable rating system may therefore best be developed by carefully comparing several tools, using the same patients and the same observers, in order to choose the most reliable items from each. 6. An ideal evaluation tool must meet the following conditions: (1) It should show the objective functional status at a given point in time. (2) It should be repeatable consecutively so that changes in functional status can be tracked. (3) It should make the treatment program easy to observe. (4) It should give the same result with another rater, so that raters can exchange information with treatment team members. (5) It should be practical and simple. (6) The patient should not suffer because of the observer.


Increase of Spoken Number of Syllables Using MIT (Melodic Intonation Therapy): Case Studies on Older Adults with Stroke and Aphasia (MIT(Melodic Intonation Therapy) 중심의 음악활동을 이용한 실어증을 가진 뇌졸중 노인의 음절 수 증가에 대한 사례 연구)

  • Hong, Do Kyoung
    • Journal of Music and Human Behavior / v.2 no.2 / pp.57-67 / 2005
  • Most stroke patients have not only physical difficulties but also speech and neurological disorders caused by hemiplegia, and such unexpected changes cause psychological maladjustment and absent-mindedness. In particular, reduced physical ability can lead to serious emotional problems arising from failure or frustration in daily life. Treatment of stroke patients generally emphasizes physical rehabilitation, yet these patients often have considerable speech disorders such as aphasia or articulation disorder. Impaired cognitive function, mental disorders such as hypochondria, and even visual and auditory disorders also appear. It is therefore effective to integrate verbal remediation with other treatments in the medical care environment. Patients with language disorders very often withdraw psychologically, so music therapy, which offers rich emotional experience, is well suited to patients with aphasia. This study primarily investigated the effect of 10 treatment sessions on the total number of syllables spoken, and aimed to help the subjects confirm their own worth through success at the given tasks and gain confidence in their abilities. All 10 sessions were scored according to the MIT manual and improvement was measured; that is, accomplishment was analyzed within each level in order to show detailed changes in the total number of syllables spoken. The results of the program, which progressed from 2 syllables to 4 syllables, are summarized as follows. Subject A completed the preliminary stage, Level I. With 2 syllables, Subject A advanced to Level III in the fifth session and to Level IV in the seventh session; with 3 syllables, to Level III in the seventh session and to Level IV in the ninth session; and with 4 syllables, showed a low success rate of 8% in the first session but, after repeated practice, improved considerably in the sixth session and advanced to Level III in the eighth session and to Level IV in the tenth session. Subject B also completed the preliminary stage, Level I. With 2 syllables, Subject B advanced to Level III in the fourth session and to Level IV in the sixth session; with 3 syllables, to Level III in the fifth session and to Level IV in the seventh session; and with 4 syllables, showed a low success rate of 10% in the first session, improved considerably in the fifth session, and advanced to Level III in the seventh session, but did not reach Level IV by the tenth session. As a result, the effect of music therapy using MIT was not statistically significant, but the total number of syllables spoken and the task success rate improved overall. Music intervention using MIT therefore has a positive effect on the verbal ability of patients with Broca's aphasia and on their language rehabilitation.
